Filtering and resampling
Date
Wednesday, January 21, 2026
Links of interest
Notes
Among other points, we discussed:
That filtering a dataset means refining data to:
- Remove errors.
- Ensure that features¹ are relevant to your question.
- Reduce noise and/or outliers.
- Improve analytical efficiency.
The fact that filtering inevitably modifies the raw dataset. As such, it is important to either filter in code, so that the raw files are never overwritten, or store filtered data separately from raw data.
The importance of documenting all filtering operations (ideally, in a README.md).
Various methods for filtering, including removing data by attribute value, detrending, smoothing, and isolating data by frequency.
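As an illustration, three of these methods (removing data by attribute value, detrending, and smoothing) might be sketched in plain Python as follows. This is a minimal sketch and not the approach used in class: the `raw` series, the `-999.0` error value, the thresholds, and all function names are hypothetical. Each function returns a new list, so the raw data is left unmodified. Isolating data by frequency is not covered here.

```python
from statistics import mean

# Hypothetical raw series of (time index, value) pairs; -999.0 stands in for a sensor error.
raw = [(0, 1.0), (1, 2.1), (2, -999.0), (3, 4.2), (4, 4.9), (5, 6.1)]

def filter_by_value(records, lo, hi):
    """Keep only records whose value lies in [lo, hi]; returns a new list."""
    return [(t, v) for t, v in records if lo <= v <= hi]

def detrend_linear(records):
    """Subtract a least-squares straight line from the values."""
    ts = [t for t, _ in records]
    vs = [v for _, v in records]
    t_bar, v_bar = mean(ts), mean(vs)
    slope = (sum((t - t_bar) * (v - v_bar) for t, v in records)
             / sum((t - t_bar) ** 2 for t in ts))
    intercept = v_bar - slope * t_bar
    return [(t, v - (slope * t + intercept)) for t, v in records]

def moving_average(records, window=3):
    """Smooth with a centered moving average; the edges use a shorter window."""
    half = window // 2
    vs = [v for _, v in records]
    return [(t, mean(vs[max(0, i - half): i + half + 1]))
            for i, (t, _) in enumerate(records)]

clean = filter_by_value(raw, lo=-100.0, hi=100.0)  # removes the -999.0 record
residuals = detrend_linear(clean)                  # values with the linear trend removed
smoothed = moving_average(clean, window=3)         # noise-reduced copy of the cleaned series
```

Because `raw` is never reassigned or mutated, the script itself serves as a record of the filtering steps, which can then be summarized in the README.md.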
Resampling, which refers to the process of generating one or more new data points from some sample².
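One common instance of this idea is the bootstrap, in which new samples of the same size are drawn with replacement from the original sample. The sketch below is offered only as an illustration, on the assumption that the bootstrap is among the methods of interest; the `sample` values, the function name `bootstrap_means`, and its parameters are hypothetical.

```python
import random
from statistics import mean

def bootstrap_means(sample, n_resamples=1000, seed=0):
    """Draw n_resamples samples with replacement and return the mean of each."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    n = len(sample)
    return [mean(rng.choices(sample, k=n)) for _ in range(n_resamples)]

# Hypothetical measurements.
sample = [2.3, 4.1, 3.8, 5.0, 4.4, 3.1]
means = bootstrap_means(sample)

# The spread of the resampled means gives a rough sense of the uncertainty in the sample mean.
print(min(means), max(means))
```

Each resampled mean is a new data point generated from the original sample, and the fixed seed keeps the run repeatable.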
Below are formal definitions of several different resampling methods: