Beat dirty data

series: NO DATA SCIENTIST IS THE SAME! – part 5. All Python notebooks from this series are available on our GitLab page.

Hyperparameter tuning for hyperaccurate XGBoost model

series: NO DATA SCIENTIST IS THE SAME! – part 4. All Python notebooks from this series are available on our GitLab page.

Good model by default using XGBoost

series: NO DATA SCIENTIST IS THE SAME! – part 3. All Python notebooks from this series are available on our GitLab page.

Data to predict which employees are likely to leave

series: NO DATA SCIENTIST IS THE SAME! – part 2. All Python notebooks from this series are available on our GitLab page.

Introducing our Data Science Rock Stars

series: NO DATA SCIENTIST IS THE SAME! – part 1. All Python notebooks from this series are available on our GitLab page.

Forecasting Dutch marriages after Covid-19

In this blog we use time series analysis to model the trend in Dutch marriages and extrapolate it to the first two Covid-19 years (2020 and 2021). This sheds some light on the expected number of postponed marriages and on what might be ahead of us in 2022, depending of course on local restrictions next year. We use publicly available data from Statistics Netherlands on the number of marriages in past years.

Save any type of file from Azure Synapse Notebook on Azure Data Lake Gen2

If you are more of a Data Scientist than a Data Engineer, have just started working in Azure Synapse Analytics Studio, and feel lost and frustrated every now and again, I feel you and I’m here for you. I can only hope you are as lucky as I am, having some very skilled Data Engineering […]

What we Learned from Kaggle’s CommonLit Readability Prize

What did we learn from joining Kaggle’s CommonLit Readability Prize? Well, besides being a lot of fun, it definitely helped us explore the text analytics landscape even further. You can read our whole story in this article.

Can machine learning assess the readability of texts?

For our latest Project Friday, we entered the CommonLit Readability competition on Kaggle. We learned a lot from working together on this cool NLP task. Stay tuned to see where this led us.