Saturday 30 September 2023

Friday 29 September 2023

Thursday 28 September 2023

Wednesday 27 September 2023

New top story on Hacker News: Show HN: Using LLMs and Embeddings to classify application errors
19 by vadman97 | 0 comments on Hacker News.
Hi Hacker News! We’re Vadim and Chris from Highlight.io [1]. We do web app monitoring and are working on using LLMs/embeddings to add new functionality to our error monitoring product. Given that there are a lot of founders/engineers using LLMs in their products, we figured we’d share how we built the new functionality, its impact on our workflows, and how you can try it out.

Our goal was to build two features: (1) tagging errors (e.g. deeming an error an “authentication error” or a “database error”); and (2) grouping similar errors together (e.g. two errors that have different stack traces and bodies but are semantically very similar). Each of these relies heavily on comparing text across our application.

After some experimentation with the OpenAI embeddings API [3], we went ahead and hosted a private instance of thenlper/gte-large (an open-source, MIT-licensed model), which is a 1024-dimension model running on an Intel Ice Lake 2 vCPU machine on Hugging Face [4].

Our general approach for classifying/comparing text is as follows. As each set of tokens (i.e. a string) comes in, our backend makes a request to an inference endpoint and receives a 1024-dimension float vector as a response (see the code here [5]). We then store that vector using pgvector [6]. To compare any two sets of tokens for similarity, we simply look at the Euclidean distance between their respective embeddings using the ivfflat index implemented by pgvector (example code here [7]).

To tag errors, we assign an error its most relevant tag from a predetermined set decided by us. For example, if we tag an error as an "authentication error" or a "database error", we give developers a starting point before they inspect an issue (see the logic here [8]). Anecdotally, this approach seems to work very well. For example, here are two authentication errors that got tagged as “Authentication Error”:

* Firebase: A network AuthError has occurred
* Error retrieving user from firebase api for email verification: cannot find user from uid.

We also use these error embeddings to group similar errors. To decide whether an error joins a group or starts a new one, we decide on a distance threshold (using the Euclidean distance) ahead of time. An interesting thing about this approach, compared to using a text-based heuristic, is that two errors with different stack traces can still be grouped together. Here’s an example:

* github.com/highlight-run/highlight/backend/worker.(*Worker).ReportStripeUsage
* github.com/highlight-run/highlight/backend/private-graph/graph.(*Resolver).GetSlackChannelsFromSlack.func1

Both were reported as `integration api error`, as they involve the Stripe and Slack integrations respectively. The neat thing is that the LLM can use the full context of an error and match based on the most relevant details about the error.

We have rolled out a first version of the error grouping logic to our cloud product [9], and there’s a demo of all the functionality at [2]. Long-term, if the HN community has other ideas of what we could build with LLM tooling in observability, we’re all ears. Let us know what you think!

Links
[1] https://ift.tt/NUlVxBm
[2] https://ift.tt/fVMY7jE
[3] https://ift.tt/y1WXcHQ
[4] https://ift.tt/PGSJykc
[5] https://ift.tt/5tpxm6n...
[6] https://ift.tt/jzUvcqR...
[7] https://ift.tt/OTQFuyk...
[8] https://ift.tt/mwTeO6J...
[9] https://ift.tt/k6SdbYn
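
The tagging and grouping steps described in the post can be sketched roughly as follows. This is a minimal illustration in Python, not Highlight's actual code (their implementation is behind the links in the post). It assumes embeddings are already available as plain lists of floats; the tiny 3-dimensional vectors, the `TAG_VECTORS` table, the function names, and the idea of keeping one representative vector per group are all hypothetical stand-ins for the 1024-dimension gte-large vectors and the pgvector lookup.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]

# Hypothetical tag embeddings. In the post these would be 1024-dimension
# vectors returned by the inference endpoint; 3 dimensions keep this readable.
TAG_VECTORS: Dict[str, Vector] = {
    "authentication error": [1.0, 0.0, 0.0],
    "database error": [0.0, 1.0, 0.0],
    "integration api error": [0.0, 0.0, 1.0],
}


def euclidean(a: Vector, b: Vector) -> float:
    """Euclidean (L2) distance between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_tag(error_vec: Vector, tag_vecs: Dict[str, Vector]) -> str:
    """Tagging: pick the tag whose embedding is closest to the error."""
    return min(tag_vecs, key=lambda tag: euclidean(error_vec, tag_vecs[tag]))


def assign_group(error_vec: Vector, groups: List[Vector], threshold: float) -> int:
    """Grouping: join the closest group if it is within `threshold`,
    otherwise start a new group. Returns the index of the group.

    Assumption: each group is represented by the embedding of its first
    error. The post only says a fixed distance threshold is used.
    """
    best_idx, best_dist = -1, math.inf
    for i, representative in enumerate(groups):
        dist = euclidean(error_vec, representative)
        if dist < best_dist:
            best_idx, best_dist = i, dist
    if best_idx >= 0 and best_dist <= threshold:
        return best_idx
    groups.append(list(error_vec))
    return len(groups) - 1
```

With made-up embeddings such as `[0.1, 0.0, 0.9]` for the Stripe error and `[0.0, 0.2, 0.8]` for the Slack error, both land in the same group at a threshold of 0.5 even though their stack traces share no text. In a real deployment the linear scan over groups would be replaced by an indexed nearest-neighbour query, which is the role the post gives to pgvector's ivfflat index.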

Tuesday 26 September 2023

Monday 25 September 2023

Friday 22 September 2023

Thursday 21 September 2023

Wednesday 20 September 2023

Tuesday 19 September 2023

New top story on Hacker News: Show HN: Hydra - Open-Source Columnar Postgres
38 by coatue | 3 comments on Hacker News.
Hi HN, Hydra CEO here.

Hydra is an open-source extension that adds columnar tables to Postgres for efficient analytical reporting. With Hydra, you can analyze billions of rows instantly without changing code.

Demo video (5 min): https://youtu.be/1yzxgb0Oyrw
GitHub repo: https://ift.tt/RptH6fm

For the 1.0 GA release, aggregate queries are over *60% faster* than in the Hydra beta due to aggregate vectorization. Spatial indexes (gin, gist, spgist, and rum indexes) and pg_hint_plan are now enabled for performance optimization.

Postgres is great, but aggregates can take minutes to hours to return results on large data sets. Long-running analytical queries hog database resources and degrade performance. Use Hydra to run much faster analytics on Postgres without changing code.

For testing, try the Hydra free tier to create a columnar Postgres instance in the cloud: https://ift.tt/j92Ulfa
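
For readers unfamiliar with why a columnar layout helps aggregates, here is a toy Python sketch. It says nothing about how Hydra stores data internally; the sample rows, field names, and helper functions are all made up for illustration. The point is only that an aggregate over one field has to touch every whole row in a row layout, but only a single contiguous list in a column layout.

```python
from typing import Any, Dict, List

# Hypothetical sample data in a row-oriented layout: one dict per row.
ROWS: List[Dict[str, Any]] = [
    {"order_id": 1, "region": "eu", "amount": 10.0},
    {"order_id": 2, "region": "us", "amount": 20.5},
    {"order_id": 3, "region": "eu", "amount": 30.0},
]


def to_columns(rows: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Pivot row-oriented data into a column-oriented layout:
    one list of values per field."""
    columns: Dict[str, List[Any]] = {}
    for row in rows:
        for field, value in row.items():
            columns.setdefault(field, []).append(value)
    return columns


def sum_row_store(rows: List[Dict[str, Any]], field: str) -> float:
    """Aggregate over rows: every row is visited, including the
    fields the query does not need."""
    return sum(row[field] for row in rows)


def sum_column_store(columns: Dict[str, List[Any]], field: str) -> float:
    """Aggregate over one column: only the values of `field` are read."""
    return sum(columns[field])
```

Both functions return the same total; the difference is how much data has to be read to get there, which is what matters once tables reach billions of rows.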

Monday 18 September 2023

Saturday 16 September 2023

Friday 15 September 2023

Thursday 14 September 2023

Wednesday 13 September 2023

Tuesday 12 September 2023

Monday 11 September 2023

Sunday 10 September 2023

New top story on Hacker News: Show HN: Erlmacs – a script to update your .emacs file for Erlang development
9 by dlachausse | 0 comments on Hacker News.
erlmacs automatically configures and updates your .emacs file with support for the Emacs mode that is included with Erlang/OTP. It frees you from having to locate the installation directory of Erlang/OTP and its bundled Emacs mode. It is an escript that only depends upon Erlang/OTP and Emacs. Note: There is not much in the way of error checking at this moment, but it does make a backup of your .emacs file before any destructive operations.

Saturday 9 September 2023

Friday 8 September 2023

Thursday 7 September 2023

Wednesday 6 September 2023

Tuesday 5 September 2023

Monday 4 September 2023

Sunday 3 September 2023

Saturday 2 September 2023

Friday 1 September 2023