I created a public Telegram channel that shares news and posts about Data Science, Machine Learning, and AI that I find interesting.
It may seem that the Python ecosystem dominates the AI/LLM field. In this post, though, we’ll see how anyone can use LLMs in their own projects, using structured data extraction as an example.
In this post you’ll learn how to extract structured data from HTML (or any other textual data).
A quick demo of how to start playing with Terraform using a local development environment
Tutorial on how to get started with Spark SQL, Spark Streaming and Kafka using Docker
A small trick for playing with Spark APIs without having a Spark distribution installed
What can go wrong with data-driven projects: lessons learned from a failed project.
Tiny note on how to deal with Parquet files with Spark
Deep dive into JSON support in Spark SQL
An example of how to write data into Apache Parquet format
A small trick for running spark-shell as a web app using Mesos and Marathon.
Step-by-step guide on how to get started with Spark Streaming and Kafka using a Docker environment