batch a few book summaries

Drafting this here for now, before it leaves my brain. In Feel Good Productivity, Ali Abdaal captures very well the idea that what we do with the intention of winding down at the end of the day sometimes does not actually achieve that purpose. Instead, he challenges his readers to paint, go for a walk, or find the things that really give you the relaxation that you actually want....

April 3, 2025 · (updated June 2, 2025) · 4 min · 697 words · Michal Piekarczyk

transformer architecture sweet spot

(DRAFT for now) What is the transformer architecture? Let me try for what is hopefully a sweet spot explanation. A deep neural network, trained by backpropagation on language data, first by self-supervised learning (aka pre-training) using Masked Language Modeling, and then by fine-tuning for tasks like text summarization, part-of-speech labeling, Named Entity Recognition labeling, question answering, translation, and others. Self-supervision, by way of next token prediction or more generally masked language modeling, lets a model be trained without human-generated labels....

March 9, 2025 · (updated March 12, 2025) · 2 min · 234 words · Michal Piekarczyk

The local sediment

Recently went to check out some of the rock faces in New Haven. Nice rocks! Cool dirt too! And in the neighborhood, across the street from the local rock climbing gym, were some additional practice rocks.

March 8, 2025 · (updated April 6, 2025) · 1 min · 36 words · Michal Piekarczyk

Chopin Etude op 10 no 1 progress with voice memo and ffprobe and chatgpt

I’ve been recording my progress on Chopin Etude op 10 no 1 for a few years now and figured I’d see where I’m at. Asked ChatGPT for help. I’ve been recording to .m4a with the Mac voice memo app. I just downloaded my files. I had been using a more or less consistent file naming scheme. ChatGPT generated a quick script for polars (because I had stopped using pandas about a year ago)....

February 22, 2025 · (updated March 1, 2025) · 3 min · 547 words · Michal Piekarczyk

build ground truth golden dataset for comparing embedding models faster with chromadb

Initially, thinking that I wanted to create this ground truth data set quickly, I started out by having a for loop and sampling data from my giant data set of documents, looking for matches to input queries, but this ended up being pretty slow and tedious. Today I switched to just setting up a local index using ChromaDB, and this ended up being extremely fast because I am not having to redo the embedding....

February 1, 2025 · (updated February 2, 2025) · 2 min · 255 words · Michal Piekarczyk