https://github.com/argmaxml/pgdl

Argmax's postgres vector similarity task

deep-learning embeddings postgresql vector-search

Host: GitHub
URL: https://github.com/argmaxml/pgdl
Owner: argmaxml
License: mit
Created: 2024-02-19T16:39:11.000Z (over 2 years ago)
Default Branch: master
Last Pushed: 2024-07-22T11:10:02.000Z (almost 2 years ago)
Last Synced: 2025-04-05T13:43:25.165Z (about 1 year ago)
Topics: deep-learning, embeddings, postgresql, vector-search
Language: Python
Homepage:
Size: 534 KB
Stars: 6
Watchers: 5
Forks: 44
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# PGDL task
### Submission deadline: March 30th, 2024
![Argmax](https://raw.githubusercontent.com/argmaxml/image-search/master/assets/argmax.png)

**Please watch [this explainer video](https://argmax.ml/pgdl).**

## Who is this repo for?
[Argmax](https://www.argmaxml.com) is hiring junior data scientists.
This repo is meant to be the first step in the process, and it will set the stage for the interview.

The data is taken from a real-life scenario, and it reflects the type of work you will do at Argmax.

## About the position
We are a boutique service company that specializes in recommendation systems and personalized search.

Building a recommender system requires understanding various aspects of user behaviour and item properties. We use a variety of tools to do so, such as large language models and vector databases.

An ideal candidate would be someone who is **proficient in Python**, **curious**, and able to do **independent research** when necessary.

This GitHub repo is designed to reflect some of the challenges you will encounter while working for Argmax.

Our offices are located at 42 Ben Gurion Rd., Ramat-Gan. We work from there on Thursdays; the rest of the week we work from home or from clients' premises.

## Some videos from past projects

1. [Uri's talk on structured output with large language models](https://www.youtube.com/watch?v=0mDgjZMcW04)
1. [Benjamin Kempinski on offline metrics](https://www.youtube.com/watch?v=5OPa2RYL5VI)
1. [Daniel Hen & Uri Goren on pricing with contextual bandits](https://www.youtube.com/watch?v=IJtNBbINKbI)
1. [Eitan Zimmerman's talk on visual feed reranking](https://www.youtube.com/watch?v=q4uF8nF5SWk)

## Getting started with the task
### Setup
1. Set up Docker on your local machine
2. In a terminal, type `docker compose up`
3. Browse to [JupyterLab](http://localhost:8888)
4. Follow the instructions in the `sql.ipynb` notebook
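
If the JupyterLab page does not load right after `docker compose up`, the containers may still be starting. The following is a minimal sketch, not part of the repo, that polls the URL from step 3 until it answers. The function name `wait_for_service` and its parameters (`timeout_s`, `interval_s`, `opener`) are hypothetical and introduced only for illustration.

```python
import time
import urllib.error
import urllib.request


def wait_for_service(url, timeout_s=60.0, interval_s=2.0,
                     opener=urllib.request.urlopen):
    """Poll `url` until it answers or `timeout_s` elapses.

    Returns True as soon as the server responds, False on timeout.
    `opener` is injectable so the function can be tested without a network.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # A successful response means the server is accepting requests.
            with opener(url, timeout=5):
                return True
        except urllib.error.HTTPError:
            # An HTTP error status still proves that the server is up.
            return True
        except (urllib.error.URLError, OSError):
            # Connection refused or similar: the service is likely still starting.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Calling `wait_for_service("http://localhost:8888")` should return `True` once JupyterLab is reachable, assuming the compose file maps JupyterLab to port 8888 as the link in step 3 suggests.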

### Submission
1. Please clone this repo to a private repo on your GitHub account.
1. Implement the missing parts.
1. Please fill in this [form](https://forms.gle/MaMtcL7yuKsbtgdk7).
1. An interview with Uri will be scheduled for you.

## The Interview process
### Hands-on Interview
1. An online hands-on interview will be scheduled during April 2024.
1. Be prepared to answer questions on your submission.
1. This repo contains a lot of code; in the follow-up interview you will be asked to extend part of it.

### On-Site Interview
1. After passing the online interview, you will be invited to the Argmax offices.
2. The interview is non-technical; its goal is to get to know you and your aspirations.
3. If everything goes well, you will get a contract around the end of April or the beginning of May.

### Best of luck with the task. Uri is available for questions on LinkedIn.
