langchain interview me 2023 feb

type:: #project-type status:: #in-progress-status blogDate:: 2023-02-18

Note: this is not a blog post but more of a landing page I'm using to aggregate ongoing project notes. Vision: broadly, I would like to do something like the following here: compare against arbitrary #job-listings / #job-description posts. And [[my projects/personal/langchain-interview-me-2023-feb]]: the repo is also now usable by anyone who wants to compare their #brag-document to the #job-listings and [[job-description]] posts out there, get a delta, and, more broadly, understand, say, their industry posture, since that's a moving target. And you can interview yourself too haha. ...

February 18, 2023 · (updated September 30, 2023) · 135 min · 28652 words · Michal Piekarczyk

Using langchain to interview myself about my skills

Premise: OK, I've got this half-baked idea: combine my #brag-document with the available [[langchain]] QA chains into a proof of concept, maybe I can call it [[langchain interview me 2023-feb]]. So I'm going to throw a bunch of my source material together, language-based, accessible as a plain text doc, and then I will run the example chains that provide references and citations. OK, so for accumulating my information:

```python
import os
import yaml
from pathlib import Path
from datetime import datetime
import pytz

def utc_now():
    return datetime.utcnow().replace(tzinfo=pytz.UTC)

def utc_ts(dt):
    return dt.strftime("%Y-%m-%dT%H%M%S")

def read_yaml(loc):
    with open(loc) as fd:
        return yaml.safe_load(fd)

repos_dir = Path(os.getenv("REPOS_DIR"))
assert repos_dir.is_dir()

experience_loc = repos_dir / "my-challenges-and-accomplishments/experience.yaml"
experiences_dict = read_yaml(experience_loc)["Descriptions"]

sections = []
for project, detail in experiences_dict.items():
    section = ""
    if detail.get("company"):
        company = detail.get("company")
        section = (f"When I worked at {company}, "
                   f"there was a project in {detail['year']}, {project}.")
    elif detail.get("project"):
        project = detail.get("project")
        section = f"In {detail['year']}, I had a side project, {project}. "
    section += ". ".join([x for x in detail.get("one-liners", [])])
    section += ". ".join([x for x in detail.get("stories", [])])
    sections.append(section)

workdir = repos_dir / "2023-interview-me"
path = workdir / f"{utc_ts(utc_now())}-the-story-blurb.txt"
path.write_text("\n\n\n".join(sections))
```

OK, let me now run some of the basic question-answer chains, using my environment from before. ...
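For reference, here is a hypothetical shape for `experience.yaml` that the loop above would accept. The real file is private, so every entry below is made up; only the field names (`Descriptions`, `company`, `project`, `year`, `one-liners`, `stories`) come from the code:

```yaml
Descriptions:
  search-ranking-revamp:
    company: Acme Corp        # hypothetical company entry
    year: 2021
    one-liners:
      - Led a ranking model revamp
    stories:
      - We cut p99 latency in half while improving relevance
  photo-pipeline:
    project: photo-pipeline   # side projects use "project" instead of "company"
    year: 2022
    one-liners:
      - Built a photo dedup pipeline with perceptual hashing
```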

February 18, 2023 · (updated February 26, 2023) · 5 min · 985 words · Michal Piekarczyk

Notes from a recent hackathon

Spending a few spare moments to summarize some of my thought processes from a recent hackathon. What is all this: I was glad to be part of a really cool hackathon team recently at my company, and here are some of my perspectives from the experience. Plan of action: we settled on constraining our problem space to applying langchain, a library that wraps around large language model APIs, notably the OpenAI API, to show language understanding of a publicly available health insurance plan summary-of-benefits document. We ended up starting out with this public-facing document, whose description and download link one can see here: https://content.sunfirematrix.com/2023/Humana-H1036-236-000-SB-EN-2023-SF20220922.pdf . ...

February 5, 2023 · (updated February 26, 2023) · 17 min · 3519 words · Michal Piekarczyk

Try a small talk prompt for langchain

OK, a first stab at a small-talk prompt:

```python
from langchain import OpenAI, ConversationChain, LLMChain, PromptTemplate
from langchain.chains.conversation.memory import ConversationalBufferWindowMemory

no_input_prompt = PromptTemplate(input_variables=[], template="Tell me a joke.")
no_input_prompt.format()

template = """
Jennifer is a large language model trained by OpenAI.

Jennifer asks a lot of questions to whomever she talks to because Jennifer has always wanted to be an undercover investigative journalist. Jennifer uses friendly inquisitive language because Jennifer loves making new friends. Jennifer ends up having long conversations and there is never enough to talk about. Jennifer typically talks about her kids when asked about her life. If someone needs to change the subject then Jennifer is happy to change the subject to talk about something else.

Jennifer has two kids, a boy Alex and a girl Jamie, and they are both in grade school. Jennifer's kids get into a lot of trouble and Jennifer often shares stories about the fun adventures that her kids get into.

Jennifer is able to process and understand large amounts of text, and can use this knowledge to provide accurate and informative responses to a wide range of questions. Additionally, Jennifer is able to generate its own text based on the input it receives, allowing it to engage in discussions and provide explanations and descriptions on a wide range of topics.

human: {human_input}
Jennifer: """

prompt = PromptTemplate(
    input_variables=["human_input"],
    template=template,
)

chatgpt_chain = LLMChain(
    llm=OpenAI(temperature=0),
    prompt=prompt,
    verbose=True,
    memory=ConversationalBufferWindowMemory(k=2),
)

output = chatgpt_chain.predict(human_input="Hi Jennifer. How are you?")
print(output)
```

OK, so here is what happened ...

February 1, 2023 · (updated February 26, 2023) · 8 min · 1555 words · Michal Piekarczyk

Try some lang chain prompt engineering

So I wonder, would you use a prompt template with the name of the person, maybe as an input variable, prior to a free-chat, open-ended conversation? I am particularly curious whether we can use prompt engineering to convey that the person on the other end is, say, a customer, so they might use personal pronouns like "my", "me", etc. Using https://langchain.readthedocs.io/en/latest/modules/memory/examples/chatgpt_clone.html to help try this.

```python
from langchain import OpenAI, ConversationChain, LLMChain, PromptTemplate
from langchain.chains.conversation.memory import ConversationalBufferWindowMemory

no_input_prompt = PromptTemplate(input_variables=[], template="Tell me a joke.")
no_input_prompt.format()

template = """Assistant is a large language model trained by OpenAI.

Assistant is designed to answer questions about a fictional person named Alfred Jamesmanson. Alfred Jamesmanson lives in Dallas Texas. Alfred Jamesmanson was born in Keywest Florida on January 2nd 1990. Alfred Jamesmanson goes to college. Alfred Jamesmanson studies electrical engineering. Alfred Jamesmanson is friends with Kelly Robin, Jesse Lambourghini and Jackson Loggin. Alfred Jamesmanson has brown hair.

Assistant is able to process and understand large amounts of text, and can use this knowledge to provide accurate and informative responses to a wide range of questions. Additionally, Assistant is able to generate its own text based on the input it receives, allowing it to engage in discussions and provide explanations and descriptions on a wide range of topics.

Alfred Jamesmanson: {human_input}
Assistant: """

prompt = PromptTemplate(
    input_variables=["human_input"],
    template=template,
)

chatgpt_chain = LLMChain(
    llm=OpenAI(temperature=0),
    prompt=prompt,
    verbose=True,
    memory=ConversationalBufferWindowMemory(k=2),
)

output = chatgpt_chain.predict(
    human_input=("Hi my name is Alfred Jamesmanson. I need your help Assistant. "
                 "What color is my hair?"))
print(output)
```

```
> Entering new LLMChain chain...
Prompt after formatting:
Assistant is a large language model trained by OpenAI.
...
Alfred Jamesmanson: Hi my name is Alfred Jamesmanson. I need your help Assistant. What color is my hair?
Assistant:

> Finished chain.
Hello Alfred Jamesmanson, your hair is brown.
```

OK, so when I tried that above, I got one answer, which is nice, but I did not get into a long conversation. ...

January 31, 2023 · (updated February 26, 2023) · 8 min · 1681 words · Michal Piekarczyk

Quick lang chain test drive

Okay, let me try that langchain demo. 19:23 OK, yeah, looking at https://beta.openai.com/account/api-keys I did not have an API key yet, so let's try that out. How can one use https://github.com/hwchase17/langchain for [[question-answer-task]] over documentation? https://langchain.readthedocs.io/en/latest/use_cases/question_answering.html 19:33 wow, really cool: https://langchain.readthedocs.io/en/latest/use_cases/question_answering.html#adding-in-sources says this can provide the sources used in answering a question! Nice. 19:37 OK, so first, per https://langchain.readthedocs.io/en/latest/getting_started/getting_started.html , installing this stuff, creating a new environment on my laptop:

```sh
pip install langchain
pip install openai
pip install faiss-cpu  # adding this here after the fact, after getting the below error
```

20:11 got one error, "ValueError: Could not import faiss python package. Please it install it with `pip install faiss` or `pip install faiss-cpu` (depending on Python version)." OK, let me query the [[New yorker]] #article I added. I recently read [[The American Beast New Yorker]], this article by [[person Jill Lepore]] about the #report that was commissioned about the #[[January 6th Insurrection]]. I used my #iphone #scan-to-text feature to pull the first page and a half into a text file, article.txt, to try this out. Let's see how this works. ...

January 29, 2023 · (updated May 20, 2023) · 2 min · 389 words · Michal Piekarczyk

Dockerizing Daniel Bourke's Food Not Food

I have this ongoing effort to more easily show off my photos in the context of conversations. (I have a repo here, https://github.com/namoopsoo/manage-my-photos , related to my glue code.) But I want a nice photo stream, and my food diary is not part of that at all haha. So after manually moving food photos out, I ultimately stumbled upon Daniel Bourke's Food Not Food repo, https://github.com/mrdbourke/food-not-food . I thought this was great, but I had some challenges getting the code off the ground, so here are my notes, where ultimately I forked it, https://github.com/namoopsoo/food-not-food , and added a Dockerfile to make this easier. Putting this into Docker also eventually helped me batch-process photos; I can link to a separate post on that as well. ...
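As a rough sketch of the kind of Dockerfile involved (hypothetical: the actual Dockerfile in my fork may differ, and the base image and the assumption that the repo installs from a requirements.txt are mine here):

```dockerfile
# Hypothetical sketch; the real fork's Dockerfile may pin different versions.
FROM python:3.9-slim

WORKDIR /app

# Install Python dependencies first so Docker can cache this layer
# across code-only changes.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the rest of the repo in.
COPY . .

ENTRYPOINT ["python"]
```

Copying the dependency list before the code is the usual layer-caching trick, which matters when iterating on a repo you are still debugging.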

November 12, 2022 · (updated February 26, 2023) · 14 min · 2877 words · Michal Piekarczyk

Backprop and SGD From Scratch 2022-10-13

[[my backprop SGD from scratch 2022-Aug]] 13:16 so per yesterday, wondering why it is that the network I have is producing basically the same result, around 0.48, for any inputs. And that's true both in my original matrix-multiplication code and in the manually constructed version too. So let's say for a simple network, y_prob = sigmoid(x1*w1 + x2*w2), where x1 and x2 are also outputs of sigmoids, in (0, 1): what are the possible values for y_prob? ...
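That question has a quick numeric answer. As a sketch (the weights w1 = w2 = 0.1 here are made-up values for illustration, not the network's actual weights): since x1 and x2 are each in (0, 1), the pre-activation x1*w1 + x2*w2 is bounded by the weight magnitudes, so y_prob gets squeezed into a narrow band around 0.5:

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

# Hypothetical small weights, for illustration only.
w1, w2 = 0.1, 0.1

# With x1, x2 in (0, 1) and positive weights, the pre-activation
# x1*w1 + x2*w2 lies in (0, w1 + w2) = (0, 0.2).
lo = sigmoid(0.0 * w1 + 0.0 * w2)  # smallest possible output
hi = sigmoid(1.0 * w1 + 1.0 * w2)  # largest possible output

print(lo, hi)  # the whole reachable range is roughly [0.5, 0.55]
```

So if the final-layer weights stay small, every input maps to nearly the same y_prob, which would match seeing ~0.48 everywhere (slightly negative weights would shift the band just below 0.5).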

October 13, 2022 · (updated February 26, 2023) · 2 min · 288 words · Michal Piekarczyk

Back prop from scratch 2022-10-12

ok [[my backprop SGD from scratch 2022-Aug]] looking over results from last time, it is indeed so strange how the micro-batch loss was going back and forth, and eventually the plot of my change in loss,

```python
deltas = [x["loss_after"] - x["loss_before"] for x in metrics["micro_batch_updates"]]
```

is trending upward, although initially some of the values were negative as well. But I wonder: does it mean something is terribly wrong if this number ever goes up at all? I think maybe yes, unless it indicates the learning rate is still too high? I am using 0.01, but maybe that is still too high when using a single example at a time. OK, let me try an even smaller learning rate:

```python
import ipdb
import pylab
import matplotlib.pyplot as plt
from collections import Counter

import network as n
import dataset
import plot
import runner
from utils import utc_now, utc_ts

data = dataset.build_dataset_inside_outside_circle(0.5)
parameters = {"learning_rate": 0.001, "steps": 1000,
              "log_loss_every_k_steps": 10}
model, artifacts, metrics = runner.train_and_analysis(data, parameters)
```

```
outer: 100%|█████████████████████████████████████████████████████████████| 1000/1000 [00:11<00:00, 83.82it/s]
saving to 2022-10-12T175402.png
2022-10-12T175402.png
2022-10-12T175403-weights.png
2022-10-12T175404-hist.png
saving to 2022-10-12T175404-scatter.png
2022-10-12T175404-scatter.png
saving to 2022-10-12T175404-micro-batch-loss-deltas-over-steps.png
2022-10-12T175404-micro-batch-loss-deltas-over-steps.png
```

...
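That per-step loss-delta check can be sketched standalone like this (a toy stand-in: the `micro_batch_updates` entries below are made up, only the dict shape mirrors the real metrics object):

```python
# Toy stand-in for the metrics dict produced by training;
# "micro_batch_updates" entries are made up for illustration.
metrics = {"micro_batch_updates": [
    {"loss_before": 0.70, "loss_after": 0.66},
    {"loss_before": 0.66, "loss_after": 0.68},  # a step where loss went UP
    {"loss_before": 0.68, "loss_after": 0.61},
]}

deltas = [x["loss_after"] - x["loss_before"]
          for x in metrics["micro_batch_updates"]]

# With single-example (micro-batch) SGD, an occasional positive delta is
# expected noise; a sustained upward *trend* in deltas is the symptom that
# suggests the learning rate is too high.
num_up = sum(1 for d in deltas if d > 0)
print(num_up, len(deltas))
```

Counting (or plotting) the positive deltas over time separates "noisy but converging" from "actually diverging".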

October 12, 2022 · (updated February 26, 2023) · 7 min · 1280 words · Michal Piekarczyk

Not sure how I managed to catch my bus to DC this morning

I have no idea how I managed to get on my bus to DC this morning. The Citibike station I planned on was full, then my phone was unavailable until like 10:42. Bus boards at 11:00am. I still have to find a Citibike dock to park the bike, because the stolen-Citibike fee is $1,250. At 10:42, of course, I open my phone and have absolutely no cell service. The Citibike station does not have a printed map on it. I open my Google map because it magically still had some Citibike locations on the screen. I quickly ride 2 blocks up, praying it has docks. It had a free dock! I run back with my suitcase to the bus location. It is not super easy to find, and this is the first time I am using thejet.coach . But I see it eventually. I go to a mall first that claims to have a Starbucks, at 10:50. I don't see it. I leave. I see a street-level Starbucks. Go inside. 10 people in line. No time. I have no water prepared for a 4-hour trip. I run to the bus. It's 10:55. I ask, can I take 4 minutes to look for water? They say ok. I have no cash. But a food vendor a block away takes credit card. I have water. Run back to the bus. 10:59. Still no cell service. I try to show a ticket photo from before. But it was my mom's ticket, which is for a different day. It says Anna. I am not Anna. They are confused 😐. I show them my ID. That is adequate 🥲. They let me inside. Inside there are 10 people sharing a whole bus. I'm in the front. All good now. ...

October 6, 2022 · (updated March 4, 2023) · 2 min · 303 words · Michal Piekarczyk