langchain_pgvector bug

I had an issue using the langchain_pgvector library: the vectorstore.add_documents function call had worked for me for a while, but for a new collection, where I was trying to add thousands of documents, only a handful got added, and without errors. Very weird. I didn’t see any useful patterns in the 4 out of 1022 documents that did get added, and pdb tracing through the code did not reveal any silent errors....

November 2, 2024 · (updated November 3, 2024) · 1 min · 188 words · Michal Piekarczyk

video game Stray Review

This contains spoilers. This isn’t really a review, but maybe just free writing about a game I really enjoyed recently. Maybe that is a review, haha, not sure. This is a story-mode action-adventure exploration game like Borderlands, The Walking Dead, Life Is Strange, and many others, where you control a protagonist, in this case a cat 🐈, interacting with other characters in the game, as well as exploring your world to solve mostly 3-space geometry puzzles of how to navigate a cat with jumps and crawl spaces....

October 25, 2024 · (updated November 17, 2024) · 8 min · 1629 words · Michal Piekarczyk

How to query pgvector data leveraging multiple indexes

First, an embedding table was created using langchain pgvector. (TODO show that example). The initial query, which was working, takes a chosen_embedding for some uuid I randomly picked from the vector table and uses the cosine distance operator <=> in an ORDER BY: explain analyze with myblah(chosen_embedding, chosen_id) as ( values ( (SELECT embedding FROM langchain_pg_embedding WHERE id = '280aefd0-cb15-4a54-924d-aab37ee8a816' ), '280aefd0-cb15-4a54-924d-aab37ee8a816') ) SELECT substr(id, 0, 8) as id, substr(document, 0, 40) as doc, round( 1 - cast(embedding <=> chosen_embedding as numeric), 3 ) as score, cmetadata->'name' as name FROM langchain_pg_embedding, myblah WHERE id !...

September 14, 2024 · (updated September 16, 2024) · 2 min · 233 words · Michal Piekarczyk

string to int conversion nulling

I had this interesting situation where I wanted to plot some numbers that were nested inside of struct columns. They were row counts in a delta table history output. I tried to plot them, but my plot treated them as categories. OK, realizing they were strings, I cast them to integers, but then I got nulls. After a bit of trial and error I realized they were probably larger than 32-bit!...

September 13, 2024 · (updated September 16, 2024) · 1 min · 118 words · Michal Piekarczyk
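
A minimal sketch of the nulling described in the entry above, in plain Python rather than the engine the post used. It assumes a lenient cast that returns null (here `None`) instead of raising when a value does not fit the target width. The helper `cast_to_int` and the sample row counts are hypothetical, made up for illustration.

```python
from typing import Optional

# Signed integer ranges for the two widths compared below.
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1


def cast_to_int(text: str, bits: int = 32) -> Optional[int]:
    """Hypothetical lenient cast: return None instead of raising on bad or oversized input."""
    lo, hi = (INT32_MIN, INT32_MAX) if bits == 32 else (INT64_MIN, INT64_MAX)
    try:
        value = int(text.strip())
    except ValueError:
        return None  # not a number at all
    return value if lo <= value <= hi else None  # out of range becomes null


# Made-up sample values standing in for string row counts.
row_counts = ["1022", "2147483647", "3000000000"]

as_int32 = [cast_to_int(s, 32) for s in row_counts]
as_int64 = [cast_to_int(s, 64) for s in row_counts]

print(as_int32)  # the value above 2**31 - 1 is nulled
print(as_int64)  # all three values survive
```

If the nulls disappear when the target is a 64-bit type, the strings were most likely valid numbers that just overflowed the 32-bit range.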

postgresql, pgvector and indexing

Placeholder. Didn’t get a chance to write this up yet, but I had used langchain pgvector to add embeddings to postgresql, ran my queries, and noticed they were slow. I read about pgvector indexing, and in psql noticed my embedding column was missing an index! I tried adding the HNSW index manually. Weird error about no dimensions on the vector column. OK, learned it needs an explicit dimension. Added it. Nice, adding the index worked....

September 7, 2024 · (updated September 9, 2024) · 1 min · 101 words · Michal Piekarczyk
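
A rough sketch of the two steps mentioned in the entry above (give the vector column an explicit dimension, then add the HNSW index), written as a small Python helper that only builds the SQL strings. The table and column names are taken from the query in the earlier entry. The dimension 1536, the helper name `hnsw_index_statements`, and the choice of `vector_cosine_ops` are assumptions for illustration, not necessarily what the post used.

```python
from typing import List


def hnsw_index_statements(table: str, column: str, dims: int) -> List[str]:
    """Hypothetical helper: build the SQL to fix the column dimension and add an HNSW index."""
    if dims <= 0:
        raise ValueError("dims must be a positive integer")
    return [
        # pgvector cannot build an index on a vector column with no declared dimension.
        f"ALTER TABLE {table} ALTER COLUMN {column} TYPE vector({dims});",
        # vector_cosine_ops matches queries that order by the <=> operator.
        f"CREATE INDEX ON {table} USING hnsw ({column} vector_cosine_ops);",
    ]


# 1536 is an assumed embedding size; use whatever your embedding model produces.
statements = hnsw_index_statements("langchain_pg_embedding", "embedding", 1536)
for sql in statements:
    print(sql)
```

The statements can then be pasted into psql or run through any database client.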