Passing large dataframes with dbutils.notebook.run in Databricks

(Note: I originally posted this elsewhere and am cross-posting it here.) At one point, while migrating Databricks notebooks to be usable purely with dbutils.notebook.run, a question came up: dbutils.notebook.run is a great way of calling notebooks explicitly, avoiding global variables that make code difficult to lint and debug, but what about Spark dataframes? I had come across this nice bit of documentation, https://docs.databricks.com/notebooks/notebook-workflows.html#pass-structured-data, about using a Spark global temp view to shuttle dataframes around by name reference, given that a caller notebook and a callee notebook share a JVM and theoretically this is instantaneous....

February 25, 2023 · (updated August 16, 2025) · 3 min · 528 words · Michal Piekarczyk

fact checking dark humor

So my friend and I were discussing the general topic of what happens to our employability as people once we reach the age of retirement. I am hopeful about this: I think people can retain their skills and probably express different kinds of skills too (some call this wisdom =D). At this point my friend introduced some dark humor, citing this link with an actuarial table, saying that,...

February 18, 2023 · (updated February 26, 2023) · 1 min · 139 words · Michal Piekarczyk

langchain interview me 2023 feb

type:: #project-type status:: #in-progress-status blogDate:: 2023-02-18 Note: This is not a blog post but kind of a landing page I’m using to aggregate ongoing project notes. Vision: Broadly, I would like to do something like the following here: compare against arbitrary #job-listings, #job-description, and [[my projects/personal/langchain-interview-me-2023-feb]]. Also, the repo is now usable by anyone who wants to compare their #brag-document to #job-listings [[job-description]] out there, get a delta, and, more broadly, understand, say, their industry posture, since that’s a moving target....

February 18, 2023 · (updated September 30, 2023) · 135 min · 28652 words · Michal Piekarczyk

Using langchain to interview myself about my skills

Premise: OK, I've got this half-baked idea: combine my #brag-document with the available [[langchain]] QA chains into a proof of concept, which maybe I can call [[langchain interview me 2023-feb]]. So I’m going to throw a bunch of my source material together (language-based, accessible as a plain-text doc), and then I will run the Link examples that provide references and citations. OK, so for accumulating my information: import yaml import tempfile from pathlib import Path from datetime import datetime import pytz def utc_now(): return datetime....

February 18, 2023 · (updated February 26, 2023) · 5 min · 985 words · Michal Piekarczyk

Notes from a recent hackathon

Spending a few spare moments to summarize some of my thought processes from a recent hackathon. What is all this: I was glad to be part of a really cool hackathon team recently at my company, and here are some of my perspectives from the experience. Plan of action: We settled on constraining our problem space to applying langchain, a library that wraps around large language model APIs, notably the OpenAI API, to show language understanding of a publicly available health insurance plan summary of benefits document....

February 5, 2023 · (updated February 26, 2023) · 17 min · 3519 words · Michal Piekarczyk