The first step in digital signal processing is to get the signal into digital form. This involves converting an analogue (continuous) signal into a digital (discrete) one. Discretisation in time is called sampling and discretisation in amplitude is called quantisation.
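To make the two kinds of discretisation concrete, here is a minimal Python sketch. It is only an illustration: the function names `sample_sine` and `quantise`, the 100 Hz test tone, the 16 kHz sampling rate and the 16-bit depth are all assumptions chosen for the example, and a real recording would be discretised by an analogue-to-digital converter rather than by evaluating a formula.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, duration_s):
    """Discretise in time: evaluate a sine wave at evenly spaced instants."""
    n_samples = int(round(sample_rate_hz * duration_s))
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(n_samples)]

def quantise(samples, n_bits):
    """Discretise in amplitude: round values in [-1, 1] to signed integer levels."""
    max_level = 2 ** (n_bits - 1) - 1  # largest positive level, e.g. 32767 for 16 bits
    return [int(round(x * max_level)) for x in samples]

# Assumed example values: 10 ms of a 100 Hz tone, 16 kHz sampling, 16-bit depth.
samples = sample_sine(freq_hz=100.0, sample_rate_hz=16000, duration_s=0.01)
levels = quantise(samples, n_bits=16)
```

After sampling, time is discrete but each amplitude is still a floating-point number; only after quantisation is every amplitude one of a fixed set of integer levels.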
The second step is usually to analyse the signal in some way, and for signals that change over time, that requires short-term analysis.
Start with these blog posts, then watch the videos.
Reading
Handbook of phonetic sciences – Ch 20 – Intro to Signal Processing for Speech
Written for a non-technical audience, this gently introduces some key concepts in speech signal processing.
When we extract short sections out of a longer, continuous signal, we need to be careful. The very act of extracting (or “cutting out”) a short section can create artefacts (specifically, discontinuities) at the beginning and end of the frame that is extracted.
An artefact is something created by the processing that is not part of the original signal being analysed. We cannot avoid artefacts entirely, but we can reduce their effect by applying a tapered window: each frame is multiplied by a function that fades the signal in at the start and out at the end, so that the frame no longer begins and ends abruptly.
(Note: “artefact” is the most common modern British spelling and “artifact” is more common in North America. Both are correct.)
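Extracting frames and tapering them can be sketched in Python as follows. This is an illustration under stated assumptions, not a prescribed method: the text above only asks for a tapered window, so the Hann shape used here is one possible choice, and the helper names `hann_window` and `extract_frames`, the 25 ms frame length and the 10 ms frame shift are all picked for the example.

```python
import math

def hann_window(length):
    """Tapered window: rises smoothly from 0, peaks mid-frame, falls back to 0."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (length - 1))
            for n in range(length)]

def extract_frames(signal, frame_length, frame_shift):
    """Cut the signal into overlapping frames and taper each one."""
    window = hann_window(frame_length)
    frames = []
    for start in range(0, len(signal) - frame_length + 1, frame_shift):
        frame = signal[start:start + frame_length]
        # Multiplying by the window fades the frame in and out,
        # which suppresses the discontinuities at its edges.
        frames.append([x * w for x, w in zip(frame, window)])
    return frames

# Assumed example values: 50 ms of a 100 Hz sine sampled at 16 kHz,
# analysed in 25 ms frames (400 samples) taken every 10 ms (160 samples).
sample_rate = 16000
signal = [math.sin(2 * math.pi * 100 * n / sample_rate) for n in range(800)]
frames = extract_frames(signal, frame_length=400, frame_shift=160)
```

Without the window, a frame that starts or ends partway through a cycle would jump abruptly between zero and some non-zero value at its edges. With the window, every frame starts and ends at zero.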