Anatoliy Plastinin’s Blog – Sampling Random Thoughts

LLMs for Rubyists (and actually everyone)

LLM

llama.cpp

data extraction

json

ruby

It may seem that the Python ecosystem dominates the AI/LLM field. In this post, however, we’ll see how anyone can use LLMs in their projects, using structured data extraction as an example.

Structured data extraction using LLMs

LLM

llama.cpp

data extraction

json

In this post you’ll learn how to extract structured data from HTML (or any other textual data) using LLMs.

Getting started with Terraform locally

Terraform

localstack

Docker

Quick demo on how to start playing with Terraform using a local development environment

Using Spark SQL and Spark Streaming together

Spark

Spark Streaming

Spark SQL

Kafka

Docker

JSON

Tutorial on how to get started with Spark SQL, Spark Streaming and Kafka using Docker

spark-shell without Spark

Spark

Scala

Small trick on how to start playing with Spark APIs without having a Spark distribution installed

When Data Driven App Smells Bad

anti-patterns

big data

architecture

What can go wrong with data-driven projects: lessons learned from a failed project.

Spark SQL and Parquet files

Spark

parquet

Tiny note on how to deal with Parquet files with Spark

Processing JSON data with Spark SQL

Spark

Spark SQL

JSON

Deep dive into JSON support in Spark SQL

How to Write Data into Parquet

parquet

An example of how to write data into Apache Parquet format

Running spark-shell in browser with Apache Mesos and Marathon

Spark

Mesos

Marathon

Small trick on how to run spark-shell as a web app using Mesos and Marathon.

Getting started with Spark Streaming using Docker

Spark

Kafka

Docker

Step-by-step guide on how to get started with Spark Streaming and Kafka using a Docker environment