Handle non-sensical operations to avoid downstream errors

When log-transforming an array of values with NumPy, keep in mind that, given negative numbers and zeros, NumPy will output NaN and -inf, respectively, along with a RuntimeWarning. Such values can cause downstream processing to fail or behave unexpectedly. numpy.log provides an argument to handle this situation. How that argument affects numpy.log’s behavior depends on whether the output goes to a preexisting container or to one created on the fly....
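As a rough illustration of the behavior described above, here is a minimal sketch. It assumes the argument in question is the `where` parameter that NumPy ufuncs accept (the excerpt does not name it), and the sample array and the NaN fill value are made up for the example.

```python
import numpy as np

# Hypothetical sample data: one negative value, one zero, two valid values
values = np.array([-1.0, 0.0, 1.0, 10.0])
mask = values > 0  # positions where the logarithm is defined

# Preexisting container: positions where mask is False are not computed
# and keep whatever the container already held (NaN here, by choice).
out = np.full_like(values, np.nan)
np.log(values, out=out, where=mask)

# Container created on the fly: positions where mask is False are left
# uninitialized, so their contents should not be relied upon.
on_the_fly = np.log(values, where=mask)
```

Pre-filling the output array makes the result at the masked positions explicit, which is why the two cases are worth distinguishing.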

January 5, 2023 · Aaron Slowey

That which is aggregated and its metadata

It’s impossible to include an associated field value alongside an aggregate of another variable. Unlike ndarrays, DataFrames are often heterogeneous. They are a more complete map of how we think of a data set as a whole. When we alter the structure of tabular data, often through aggregation of one field, we want to include values from other fields. This is an example of an issue that arises at the interface of pandas and scikit-learn, for which the ColumnTransformer was created....
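To make the problem concrete, here is a small sketch in pandas. The DataFrame, its column names, and the `idxmax` workaround are assumptions chosen for illustration; the post itself may take a different approach. A plain aggregation returns only the aggregated field, and one common way to recover the other fields is to look up the rows where the aggregate occurs.

```python
import pandas as pd

# Hypothetical data: readings taken at two sites on two dates
df = pd.DataFrame({
    "site": ["A", "A", "B", "B"],
    "date": ["2022-01-01", "2022-01-02", "2022-01-01", "2022-01-02"],
    "reading": [3.0, 7.0, 5.0, 2.0],
})

# Aggregating one field drops the others: only the maximum reading per
# site survives, not the date on which it occurred.
max_only = df.groupby("site")["reading"].max()

# One possible workaround: idxmax returns the row label of each group's
# maximum, and loc retrieves the full rows, including the date.
rows_at_max = df.loc[df.groupby("site")["reading"].idxmax()]
```

Here `rows_at_max` carries the `date` for each site's largest reading, which the plain `max()` result cannot.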

August 12, 2022 · Aaron Slowey