These are the best posts from Timur Bikmukhametov, PhD.

3 viral posts with 3,870 likes, 174 comments, and 313 shares.
3 image posts, 0 carousel posts, 0 video posts, 0 text posts.

Best Posts by Timur Bikmukhametov, PhD on LinkedIn

Advanced ML Hyperparameter Tuning: Best Method?

Bayesian vs Particle Swarm Optimization 👇

(for more ML tutorials and resources like this, subscribe to my newsletter: https://lnkd.in/gddXakxh)

1๏ธโƒฃ ๐—•๐—ฎ๐˜†๐—ฒ๐˜€๐—ถ๐—ฎ๐—ป ๐—ข๐—ฝ๐˜๐—ถ๐—บ๐—ถ๐˜‡๐—ฎ๐˜๐—ถ๐—ผ๐—ป (๐—•๐—ข)

๐—˜๐—ณ๐—ณ๐—ถ๐—ฐ๐—ถ๐—ฒ๐—ป๐˜ ๐—ณ๐—ผ๐—ฟ ๐—–๐—ผ๐˜€๐˜๐—น๐˜† ๐—˜๐˜ƒ๐—ฎ๐—น๐˜‚๐—ฎ๐˜๐—ถ๐—ผ๐—ป๐˜€
BO works well when each hyperparameter evaluation (e.g., training a model) is computationally expensive.

๐—ฆ๐—บ๐—ฎ๐—ฟ๐˜ ๐—ฆ๐—ฒ๐—ฎ๐—ฟ๐—ฐ๐—ต ๐—ถ๐—ป ๐—–๐—ผ๐—บ๐—ฝ๐—ฎ๐—ฐ๐˜ ๐—ฆ๐—ฝ๐—ฎ๐—ฐ๐—ฒ๐˜€
Performs best in low-dimensional hyperparameter spaces (typically fewer than 20 dimensions).

It uses past evaluations to sample promising points, making it sample-efficient (see the gif).

โœ… ๐—•๐—ฒ๐˜€๐˜ ๐—ณ๐—ผ๐—ฟ:
โ†ณ Cases with limited computational budgets.
โ†ณ Cases when model evaluation is time-consuming.
โ†ณ Tuning a small to moderate number of parameters.
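
To make this concrete, here is a minimal sketch of Bayesian-style tuning with Optuna (its default TPE sampler proposes new points based on past trials). The dataset, model, and search ranges are illustrative assumptions, not from the original post:

```python
# Minimal sketch: Bayesian-style hyperparameter tuning with Optuna.
# Dataset, model, and search ranges are illustrative assumptions.
import optuna
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

def objective(trial):
    # Each trial proposes hyperparameters informed by past evaluations
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 50, 300),
        "learning_rate": trial.suggest_float("learning_rate", 1e-3, 0.3, log=True),
        "max_depth": trial.suggest_int("max_depth", 2, 6),
    }
    model = GradientBoostingClassifier(**params, random_state=0)
    return cross_val_score(model, X, y, cv=3).mean()

study = optuna.create_study(direction="maximize")  # TPE sampler by default
study.optimize(objective, n_trials=30)             # only 30 costly evaluations
print(study.best_params, study.best_value)
```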

2๏ธโƒฃ ๐—ฃ๐—ฎ๐—ฟ๐˜๐—ถ๐—ฐ๐—น๐—ฒ ๐—ฆ๐˜„๐—ฎ๐—ฟ๐—บ ๐—ข๐—ฝ๐˜๐—ถ๐—บ๐—ถ๐˜‡๐—ฎ๐˜๐—ถ๐—ผ๐—ป (๐—ฃ๐—ฆ๐—ข)
๐—›๐—ฎ๐—ป๐—ฑ๐—น๐—ฒ๐˜€ ๐—›๐—ถ๐—ด๐—ต ๐——๐—ถ๐—บ๐—ฒ๐—ป๐˜€๐—ถ๐—ผ๐—ป๐—ฎ๐—น๐—ถ๐˜๐˜†
PSO suites for high-dimensional hyperparameter spaces, exploring diverse regions effectively.

๐—ฃ๐—ฎ๐—ฟ๐—ฎ๐—น๐—น๐—ฒ๐—น๐—ถ๐˜‡๐—ฎ๐—ฏ๐—น๐—ฒ
Easily scales across multiple computational nodes, making it efficient for large-scale optimization tasks.

โœ… ๐—•๐—ฒ๐˜€๐˜ ๐—ณ๐—ผ๐—ฟ:
โ†ณ High-dimensional hyperparameter spaces
โ†ณ Scenarios with abundant computational resources allowing for parallel evaluations.
โ†ณ Problems where the evaluation cost is less critical compared to exploration breadth.
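
For intuition, here is a from-scratch sketch of vanilla PSO minimizing a toy objective (a stand-in for validation loss). The swarm size and the inertia/cognitive/social weights are common textbook defaults, not values from the post:

```python
# From-scratch sketch of vanilla Particle Swarm Optimization.
# Toy objective and coefficients are illustrative assumptions.
import numpy as np

def objective(x):
    # Stand-in for a validation loss over hyperparameters
    return np.sum((x - 0.3) ** 2, axis=-1)

rng = np.random.default_rng(0)
n_particles, dim, iters = 20, 2, 50
w, c1, c2 = 0.7, 1.5, 1.5                      # inertia, cognitive, social weights

pos = rng.uniform(-1, 1, (n_particles, dim))   # particle positions
vel = np.zeros_like(pos)
pbest, pbest_val = pos.copy(), objective(pos)  # personal bests
gbest = pbest[pbest_val.argmin()].copy()       # global best

for _ in range(iters):
    r1, r2 = rng.random((2, n_particles, dim))
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = pos + vel                            # all particles can be scored in parallel
    vals = objective(pos)
    improved = vals < pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
    gbest = pbest[pbest_val.argmin()].copy()

print(gbest)  # converges toward [0.3, 0.3]
```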

๐—›๐—ผ๐˜„ ๐˜๐—ผ ๐—–๐—ต๐—ผ๐—ผ๐˜€๐—ฒ ๐˜๐—ต๐—ฒ ๐—ฅ๐—ถ๐—ด๐—ต๐˜ ๐—”๐—น๐—ด๐—ผ๐—ฟ๐—ถ๐˜๐—ต๐—บ
-> ๐—จ๐˜€๐—ฒ ๐—•๐—ฎ๐˜†๐—ฒ๐˜€๐—ถ๐—ฎ๐—ป ๐—ข๐—ฝ๐˜๐—ถ๐—บ๐—ถ๐˜‡๐—ฎ๐˜๐—ถ๐—ผ๐—ป when each evaluation is costly and you're dealing with a manageable number of hyperparameters.

-> ๐—จ๐˜€๐—ฒ ๐—ฃ๐—ฎ๐—ฟ๐˜๐—ถ๐—ฐ๐—น๐—ฒ ๐—ฆ๐˜„๐—ฎ๐—ฟ๐—บ ๐—ข๐—ฝ๐˜๐—ถ๐—บ๐—ถ๐˜‡๐—ฎ๐˜๐—ถ๐—ผ๐—ป if you have many hyperparameters to tune and the capacity for parallelized evaluations.

โ™ป๏ธ Share with network to show your interest in advanced ML practices!

P.S. Have you tried PSO in your work?
[Post image]

Anomaly Detection in Time Series, in real time.

Use Mahalanobis Distance (steps, pros, cons) 👇

🔥 More of 𝗔𝗱𝘃𝗮𝗻𝗰𝗲𝗱 𝗠𝗟 𝗖𝗼𝗻𝘁𝗲𝗻𝘁 𝗳𝗿𝗼𝗺 𝗺𝗲 here: https://lnkd.in/emb4cFCS

๐Ÿง  ๐—ช๐—ต๐—ฎ๐˜ ๐—ถ๐˜€ ๐— ๐—ฎ๐—ต๐—ฎ๐—น๐—ฎ๐—ป๐—ผ๐—ฏ๐—ถ๐˜€ ๐——๐—ถ๐˜€๐˜๐—ฎ๐—ป๐—ฐ๐—ฒ?
MD measures the distance of a point from the mean of a distribution, accounting for correlations between features in multidimensional space.

For the distributions, distributions of the train data (with no anomalies) are taken.

The bigger the distance of a new point from the distribution means, the more likely it's an anomaly.
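
In symbols, with mean vector μ and covariance matrix Σ estimated from the training data: MD(x) = √((x − μ)ᵀ Σ⁻¹ (x − μ)).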

๐—›๐—ผ๐˜„ ๐˜๐—ผ ๐˜‚๐˜€๐—ฒ ๐— ๐—— ๐—ณ๐—ผ๐—ฟ ๐—ฎ๐—ป๐—ผ๐—บ๐—ฎ๐—น๐˜† ๐—ฑ๐—ฒ๐˜๐—ฒ๐—ฐ๐˜๐—ถ๐—ผ๐—ป:
-> Train a model:
Use clean (anomaly-free) training data to calculate the mean vector and covariance matrix.

-> Compute MD for new points:
For each new data point, compute its MD using the trained model.

-> Set a threshold and flag anomalies:
Choose a statistical threshold (e.g., the 95th percentile of MD values on the training data) and flag any new point whose MD exceeds it.
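
Here is a minimal NumPy sketch of those three steps; the synthetic 2-D data and the 95th-percentile cut-off are illustrative assumptions:

```python
# Minimal sketch of MD-based anomaly detection; synthetic data for illustration.
import numpy as np

rng = np.random.default_rng(0)
train = rng.multivariate_normal([0, 0], [[1, 0.8], [0.8, 1]], size=1000)  # clean data

# 1) "Train": estimate the mean vector and (inverse) covariance from clean data
mu = train.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(train, rowvar=False))

def mahalanobis(x):
    d = x - mu
    return np.sqrt(np.einsum("...i,ij,...j->...", d, cov_inv, d))

# 2) + 3) Threshold from training distances, then flag new points
threshold = np.percentile(mahalanobis(train), 95)
new_points = np.array([[0.1, 0.2], [3.0, -3.0]])  # second point breaks the correlation
print(mahalanobis(new_points) > threshold)        # -> [False  True]
```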

๐ŸŸข ๐—”๐—ฑ๐˜ƒ๐—ฎ๐—ป๐˜๐—ฎ๐—ด๐—ฒ๐˜€:
โ†ณ Can detect gradual and steep anomalies
โ†ณ Easy to interpret and fast to compute (like PCA)
โ†ณ Detects anomalies considering feature relationships

๐Ÿ”ด ๐——๐—ถ๐˜€๐—ฎ๐—ฑ๐˜ƒ๐—ฎ๐—ป๐˜๐—ฎ๐—ด๐—ฒ๐˜€:
โ†ณ Sensitive to errors in covariance matrix
โ†ณ Needs prior handling of outliers in training data.
โ†ณ Assumes data has a multivariate normal distribution.

🔥 More of 𝗔𝗱𝘃𝗮𝗻𝗰𝗲𝗱 𝗠𝗟 𝗖𝗼𝗻𝘁𝗲𝗻𝘁 𝗳𝗿𝗼𝗺 𝗺𝗲 here: https://lnkd.in/emb4cFCS

โ™ป๏ธ Repost to show your interest in Anomaly Detection!

P.S. Which methods do you use for anomaly detection?
[Post image]

4 cases when NOT to use Random Forest

(that'll save you 50% of ML modeling time)

🔥 More of 𝗔𝗱𝘃𝗮𝗻𝗰𝗲𝗱 𝗠𝗟 𝗖𝗼𝗻𝘁𝗲𝗻𝘁 𝗳𝗿𝗼𝗺 𝗺𝗲 here: https://lnkd.in/emb4cFCS

๐ŸŸข ๐—ช๐—ต๐—ฒ๐—ป ๐—ณ๐—ฒ๐—ฎ๐˜๐˜‚๐—ฟ๐—ฒ๐˜€ ๐—ฎ๐—ป๐—ฑ ๐˜๐—ฎ๐—ฟ๐—ด๐—ฒ๐˜ ๐—ฟ๐—ฒ๐—น๐—ฎ๐˜๐—ถ๐—ผ๐—ป๐˜€๐—ต๐—ถ๐—ฝ๐˜€ ๐—ฎ๐—ฟ๐—ฒ ๐—บ๐—ผ๐˜€๐˜๐—น๐˜† ๐—น๐—ถ๐—ป๐—ฒ๐—ฎ๐—ฟ
In this case, Random Forest will barely outperform Linear or Logistic Regression.

👉 Linear models:
↳ Train faster
↳ Are easier to tune
↳ Are more interpretable

๐ŸŸข ๐—ช๐—ต๐—ฒ๐—ป ๐—ฑ๐—ฎ๐˜๐—ฎ ๐—ถ๐˜€ ๐—ป๐—ผ๐—ถ๐˜€๐˜†, ๐˜€๐—ฝ๐—ฎ๐—ฟ๐˜€๐—ฒ, ๐—ฎ๐—ป๐—ฑ ๐—ต๐—ฎ๐˜€ ๐—น๐—ผ๐˜„ ๐˜ƒ๐—ฎ๐—ฟ๐—ถ๐—ฎ๐—ฏ๐—ถ๐—น๐—ถ๐˜๐˜†
Random Forest likes when:
โ†ณ Data having many distinct feature values
โ†ณLow noise levels

For noisy and sparse data, simpler models (e.g., Linear Regression) often perform just as well.

๐ŸŸข ๐—ช๐—ต๐—ฒ๐—ป ๐—บ๐—ผ๐—ฑ๐—ฒ๐—น ๐—ฒ๐˜…๐˜๐—ฟ๐—ฎ๐—ฝ๐—ผ๐—น๐—ฎ๐˜๐—ถ๐—ผ๐—ป ๐—ถ๐˜€ ๐—ถ๐—บ๐—ฝ๐—ผ๐—ฟ๐˜๐—ฎ๐—ป๐˜
Random Forest doesnโ€™t extrapolate well.

👉 Use smooth non-linear models instead:
↳ Neural Networks
↳ Gaussian Processes
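
A quick sketch of that failure mode (illustrative data, not from the post): fit both models on x in [0, 5] with a linear target, then predict at x = 10:

```python
# Sketch: Random Forest vs Linear Regression outside the training range.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

x = np.linspace(0, 5, 200).reshape(-1, 1)
y = 2 * x.ravel() + 1                        # simple linear trend, max target = 11

rf = RandomForestRegressor(random_state=0).fit(x, y)
lr = LinearRegression().fit(x, y)

print(rf.predict([[10.0]]))  # stuck near the largest training target (~11)
print(lr.predict([[10.0]]))  # extrapolates to ~21
```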

๐ŸŸข ๐—ช๐—ต๐—ฒ๐—ป ๐˜†๐—ผ๐˜‚ ๐—ฝ๐—น๐—ฎ๐—ป ๐˜๐—ผ ๐˜‚๐˜€๐—ฒ ๐˜๐—ต๐—ฒ ๐—บ๐—ผ๐—ฑ๐—ฒ๐—น ๐—ณ๐—ผ๐—ฟ ๐—ผ๐—ฝ๐˜๐—ถ๐—บ๐—ถ๐˜‡๐—ฎ๐˜๐—ถ๐—ผ๐—ป
Random Forestโ€™s piecewise constant structure makes optimization gradients noisy and unstable.

👉 In this case, use smooth non-linear models:
↳ Neural Networks
↳ Gaussian Processes
↳ Splines
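
A small sketch of why (illustrative setup): a finite-difference "gradient" through a fitted Random Forest is zero almost everywhere, because its prediction surface is piecewise constant:

```python
# Sketch: finite-difference gradient through a Random Forest surrogate.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
x = rng.uniform(0, 5, (200, 1))
y = np.sin(x).ravel()
rf = RandomForestRegressor(random_state=0).fit(x, y)

eps = 1e-4
x0 = np.array([[2.0]])
grad = (rf.predict(x0 + eps) - rf.predict(x0)) / eps
print(grad)  # 0.0 unless the tiny step happens to cross a split threshold
```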

🔥 More of 𝗔𝗱𝘃𝗮𝗻𝗰𝗲𝗱 𝗠𝗟 𝗖𝗼𝗻𝘁𝗲𝗻𝘁 𝗳𝗿𝗼𝗺 𝗺𝗲 here: https://lnkd.in/emb4cFCS

โ™ป๏ธ Repost to show your interest in Practical Machine Learning!

P.S. What's your go-to alternative when Random Forest isn't the right fit?
[Post image]
