diff --git a/docs/posts/2019-12-16-TensorFlow-Polynomial-Regression.html b/docs/posts/2019-12-16-TensorFlow-Polynomial-Regression.html
index 48c477a..60d6131 100644
--- a/docs/posts/2019-12-16-TensorFlow-Polynomial-Regression.html
+++ b/docs/posts/2019-12-16-TensorFlow-Polynomial-Regression.html
@@ -520,6 +520,9 @@ values using the X values. We then plot it to compare the actual data and predic
Basically, if you train your machine learning model on a small dataset for a very large number of epochs, the model will learn all the deformities/noise in the data and treat them as normal. When it then sees new data, it will discard it as noise, negatively impacting the model's accuracy.
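The overfitting described above is easy to see in a toy example (my own illustration, not from the original post): fit a needlessly high-degree polynomial to a handful of noisy points and it memorizes the noise, then behaves badly away from the training data.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 8)
y = 2 * x + rng.normal(0, 0.1, size=x.shape)  # noisy samples of a straight line

low = np.polynomial.Polynomial.fit(x, y, deg=1)   # a sensible model
high = np.polynomial.Polynomial.fit(x, y, deg=7)  # one coefficient per data point

# The degree-7 fit passes through every training point (near-zero training
# error) -- exactly the "learning the noise" failure described above.
print(np.max(np.abs(high(x) - y)))  # ~0: memorized the training data
print(low(1.2), high(1.2))          # outside the training range the high-degree fit strays
```

The linear fit keeps a residual roughly the size of the noise, which is the behavior you actually want on new data.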
+
+
+
diff --git a/docs/posts/2019-12-22-Fake-News-Detector.html b/docs/posts/2019-12-22-Fake-News-Detector.html
index ea7fc41..2e1a76f 100644
--- a/docs/posts/2019-12-22-Fake-News-Detector.html
+++ b/docs/posts/2019-12-22-Fake-News-Detector.html
@@ -272,6 +272,9 @@ DescriptionThe bag-of-words model is a simplifying representation used in NLP, i
}
+
+
+
diff --git a/docs/posts/2020-01-14-Converting-between-PIL-NumPy.html b/docs/posts/2020-01-14-Converting-between-PIL-NumPy.html
index 0b88665..9fedcb5 100644
--- a/docs/posts/2020-01-14-Converting-between-PIL-NumPy.html
+++ b/docs/posts/2020-01-14-Converting-between-PIL-NumPy.html
@@ -62,6 +62,9 @@
img.save(destination, "JPEG", quality=80, optimize=True, progressive=True)
+
+
+
diff --git a/docs/posts/2020-01-15-Setting-up-Kaggle-to-use-with-Colab.html b/docs/posts/2020-01-15-Setting-up-Kaggle-to-use-with-Colab.html
index 5341ba7..ba58f68 100644
--- a/docs/posts/2020-01-15-Setting-up-Kaggle-to-use-with-Colab.html
+++ b/docs/posts/2020-01-15-Setting-up-Kaggle-to-use-with-Colab.html
@@ -82,6 +82,9 @@
The fact that there are defined steps for producing Vaporwave gave me the idea that it can actually be made using programming. Stay tuned for when I publish the program I am working on (generating A E S T H E T I C artwork and remixes).
+
+
+
diff --git a/docs/posts/2020-04-13-Fixing-X11-Error-AmberTools-macOS.html b/docs/posts/2020-04-13-Fixing-X11-Error-AmberTools-macOS.html
index d540e3b..48eb451 100644
--- a/docs/posts/2020-04-13-Fixing-X11-Error-AmberTools-macOS.html
+++ b/docs/posts/2020-04-13-Fixing-X11-Error-AmberTools-macOS.html
@@ -71,6 +71,9 @@ Configure failed due to the errors above!
If you do not have XQuartz installed, you need to run brew cask install xquartz
This is just the docking command for AutoDock Vina. In the next part, I will show how to use PyMOL and a plugin to directly generate the coordinates in Vina format (--center_x -9.7 --center_y 11.4 --center_z 68.9 --size_x 19.3 --size_y 29.9 --size_z 21.3) without needing to type them manually.
The package is available on my repository and only depends on Boost. (Both Vina and Vina-Split are part of the package.)
+
+
+
diff --git a/docs/posts/2020-07-01-Install-rdkit-colab.html b/docs/posts/2020-07-01-Install-rdkit-colab.html
index b65ece1..f4d7b5e 100644
--- a/docs/posts/2020-07-01-Install-rdkit-colab.html
+++ b/docs/posts/2020-07-01-Install-rdkit-colab.html
@@ -142,6 +142,9 @@
install()
+
+
+
diff --git a/docs/posts/2020-08-01-Natural-Feature-Tracking-ARJS.html b/docs/posts/2020-08-01-Natural-Feature-Tracking-ARJS.html
index 7322d04..3092dbd 100644
--- a/docs/posts/2020-08-01-Natural-Feature-Tracking-ARJS.html
+++ b/docs/posts/2020-08-01-Natural-Feature-Tracking-ARJS.html
@@ -317,6 +317,9 @@ Serving HTTP on 0.0.0.0 port 8000 ...
+
+
+
diff --git a/docs/posts/2020-10-11-macOS-Virtual-Cam-OBS.html b/docs/posts/2020-10-11-macOS-Virtual-Cam-OBS.html
index 7a24e04..3798da3 100644
--- a/docs/posts/2020-10-11-macOS-Virtual-Cam-OBS.html
+++ b/docs/posts/2020-10-11-macOS-Virtual-Cam-OBS.html
@@ -148,6 +148,9 @@ new Dics({
});
+
+
+
diff --git a/docs/posts/2020-11-17-Lets-Encrypt-DuckDns.html b/docs/posts/2020-11-17-Lets-Encrypt-DuckDns.html
index 699d6b6..8ffd418 100644
--- a/docs/posts/2020-11-17-Lets-Encrypt-DuckDns.html
+++ b/docs/posts/2020-11-17-Lets-Encrypt-DuckDns.html
@@ -108,6 +108,9 @@ navanspi.duckdns.org. 60 IN TXT
+
+
diff --git a/docs/posts/2020-12-1-HTML-JS-RSS-Feed.html b/docs/posts/2020-12-1-HTML-JS-RSS-Feed.html
index 206a463..d3fbee7 100644
--- a/docs/posts/2020-12-1-HTML-JS-RSS-Feed.html
+++ b/docs/posts/2020-12-1-HTML-JS-RSS-Feed.html
@@ -241,6 +241,9 @@
</body></html>
+
+
+
diff --git a/docs/posts/2021-06-25-Blog2Twitter-P1.html b/docs/posts/2021-06-25-Blog2Twitter-P1.html
index af7586a..cb0fc1c 100644
--- a/docs/posts/2021-06-25-Blog2Twitter-P1.html
+++ b/docs/posts/2021-06-25-Blog2Twitter-P1.html
@@ -135,6 +135,9 @@ I am not handling lists or images right now.
For the next part, I will try to append the code as well.
I actually added the code to this post after running the program.
+
+
+
diff --git a/docs/posts/2021-06-25-NFC-Music-Cards-Basic-iOS.html b/docs/posts/2021-06-25-NFC-Music-Cards-Basic-iOS.html
index 2c541d8..78b1ef5 100644
--- a/docs/posts/2021-06-25-NFC-Music-Cards-Basic-iOS.html
+++ b/docs/posts/2021-06-25-NFC-Music-Cards-Basic-iOS.html
@@ -69,6 +69,9 @@ So, I did not have to ensure this could work with any device. I settled with usi
+
+
+
diff --git a/docs/posts/2021-06-26-Cheminformatics-On-The-Web-2021.html b/docs/posts/2021-06-26-Cheminformatics-On-The-Web-2021.html
index 29ede6e..c62ffc3 100644
--- a/docs/posts/2021-06-26-Cheminformatics-On-The-Web-2021.html
+++ b/docs/posts/2021-06-26-Cheminformatics-On-The-Web-2021.html
@@ -122,6 +122,9 @@ Hopefully, this encourages you to explore the world of cheminformatics on the we
@article{chauhan_2019, title={Detecting Driver Fatigue, Over-Speeding, and Speeding up Post-Accident Response}, volume={6}, url={https://www.irjet.net/archives/V6/i5/IRJET-V6I5318.pdf}, number={5}, journal={International Research Journal of Engineering and Technology (IRJET)}, author={Chauhan, Navan}, year={2019}}
I recently came across a movie/tv-show recommender, couchmoney.tv. I loved it. I decided that I wanted to build something similar, so I could tinker with it as much as I wanted.
+
+
I also wanted a recommendation system I could use via a REST API. Although I have not included that part in this post, I did eventually create it.
+
+
How?
+
+
By measuring the cosine of the angle between two vectors, you get a value in the range [-1, 1], with higher values meaning more similar (for non-negative vectors the range is [0, 1], with 0 meaning no similarity). Now, if we find a way to represent information about movies as vectors, we can use cosine similarity as a metric to find similar movies.
+
+
As we are recommending just based on the content of the movies, this is called a content based recommendation system.
+
+
Data Collection
+
+
Trakt exposes a nice API to search for movies/tv-shows. To access the API, you first need to get an API key (the Trakt ID you get when you create a new application).
+
+
I decided to use SQLAlchemy with a SQLite backend, to make my life easier if I ever felt like switching to Postgres.
+
+
First, I needed to check the total number of records in Trakt’s database.
In the end, I could have dropped the embeddings field from the table schema as I never got around to using it.
+
+
Scripting Time
+
+
from database import *
+from tqdm import tqdm
+import requests
+import os
+
+trakt_id = os.getenv("TRAKT_ID")
+
+max_requests = 5000  # How many requests I wanted to wrap everything up in
+req_count = 0  # A counter for how many requests I have made
+
+years = "1900-2021"
+page = 1  # The initial page number for the search
+extended = "full"  # Required to get additional information
+limit = "10"  # No. of entries per request -- this will be automatically picked based on max_requests
+languages = "en"  # Limit to English
+
+api_base = "https://api.trakt.tv"
+database_url = "sqlite:///jlm.db"
+
+headers = {
+    "Content-Type": "application/json",
+    "trakt-api-version": "2",
+    "trakt-api-key": trakt_id
+}
+
+params = {
+    "query": "",
+    "years": years,
+    "page": page,
+    "extended": extended,
+    "limit": limit,
+    "languages": languages
+}
+
+# Helper function to get desirable values from the response
+def create_movie_dict(movie: dict):
+    m = movie["movie"]
+    movie_dict = {
+        "title": m["title"],
+        "overview": m["overview"],
+        "genres": m["genres"],
+        "language": m["language"],
+        "year": int(m["year"]),
+        "trakt_id": m["ids"]["trakt"],
+        "released": m["released"],
+        "runtime": int(m["runtime"]),
+        "country": m["country"],
+        "rating": int(m["rating"]),
+        "votes": int(m["votes"]),
+        "comment_count": int(m["comment_count"]),
+        "tagline": m["tagline"]
+    }
+    return movie_dict
+
+# Get total number of items
+params["limit"] = 1
+res = requests.get(f"{api_base}/search/movie", headers=headers, params=params)
+total_items = res.headers["x-pagination-item-count"]
+
+engine, Session = init_db_stuff(database_url)
+
+
+for page in tqdm(range(1, max_requests + 1)):
+    params["page"] = page
+    params["limit"] = int(int(total_items) / max_requests)
+    movies = []
+    res = requests.get(f"{api_base}/search/movie", headers=headers, params=params)
+
+    if res.status_code == 500:
+        break
+    elif res.status_code == 200:
+        pass
+    else:
+        print(f"OwO Code {res.status_code}")
+
+    for movie in res.json():
+        movies.append(create_movie_dict(movie))
+
+    with engine.connect() as conn:
+        for movie in movies:
+            with conn.begin() as trans:
+                stmt = insert(movies_table).values(
+                    trakt_id=movie["trakt_id"], title=movie["title"], genres=" ".join(movie["genres"]),
+                    language=movie["language"], year=movie["year"], released=movie["released"],
+                    runtime=movie["runtime"], country=movie["country"], overview=movie["overview"],
+                    rating=movie["rating"], votes=movie["votes"], comment_count=movie["comment_count"],
+                    tagline=movie["tagline"])
+                try:
+                    result = conn.execute(stmt)
+                    trans.commit()
+                except IntegrityError:
+                    trans.rollback()
+    req_count += 1
+
+
+
(Note: I was well within the rate limit, so I did not have to slow down or implement any other measures.)
+
+
Running this script took me approximately 3 hours and resulted in an SQLite database of 141.5 MB.
+
+
Embeddings!
+
+
I did not want to put my poor Mac through the estimated 23 hours it would have taken to embed the sentences. I decided to use Google Colab instead.
+
+
Because of the small size of the database file, I was able to just upload the file.
+
+
For the encoding model, I decided to use the pretrained paraphrase-multilingual-MiniLM-L12-v2 model from SentenceTransformers, a Python framework for SOTA sentence, text and image embeddings. I wanted to use a multilingual model as I personally consume content in various languages (natively, no dubs or subs) and some of the sources for their information do not translate to English. As of writing this post, I did not include any other database except Trakt.
+
+
While deciding how I was going to process the embeddings, I came across multiple solutions:
+
+
+
Milvus - An open-source vector database with similarity search functionality
Pinecone - A fully managed vector database with similarity search functionality
+
+
+
I did not want to waste time setting up a self-hosted option, so I decided to go with Pinecone, which offers 1M 768-dim vectors for free with no credit card required (our embeddings are 384-dim dense vectors).
+
+
Getting started with Pinecone was as easy as:
+
+
+
Signing up
+
Specifying the index name and vector dimensions along with the similarity search metric (Cosine Similarity for our use case)
+
Getting the API key
+
Installing the Python module (pinecone-client)
+
+
+
import pandas as pd
+import pinecone
+from sentence_transformers import SentenceTransformer
+from tqdm import tqdm
+from database import *  # provides init_db_stuff
+
+database_url = "sqlite:///jlm.db"
+PINECONE_KEY = "not-this-at-all"
+batch_size = 32
+
+pinecone.init(api_key=PINECONE_KEY, environment="us-west1-gcp")
+index = pinecone.Index("movies")
+
+model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2", device="cuda")
+engine, Session = init_db_stuff(database_url)
+
+df = pd.read_sql("SELECT * FROM movies", engine)
+df["combined_text"] = df["title"] + ": " + df["overview"].fillna('') + " - " + df["tagline"].fillna('') + " Genres:- " + df["genres"].fillna('')
+
+# Creating the embeddings in batches and upserting them into the index
+for x in tqdm(range(0, len(df), batch_size)):
+    to_send = []
+    trakt_ids = df["trakt_id"][x:x + batch_size].tolist()
+    sentences = df["combined_text"][x:x + batch_size].tolist()
+    embeddings = model.encode(sentences)
+    for idx, value in enumerate(trakt_ids):
+        to_send.append(
+            (str(value), embeddings[idx].tolist())
+        )
+    index.upsert(to_send)
+
+
+
That's it!
+
+
Interacting with Vectors
+
+
We use the trakt_id for the movie as the ID for the vectors and upsert it into the index.
+
+
To find similar items, we will first have to map the name of the movie to its trakt_id, get the embeddings we have for that id and then perform a similarity search. It is possible that this additional step of mapping could be avoided by storing information as metadata in the index.
movie_name = "Now You See Me"
+
+movie_trakt_id = get_trakt_id(df, movie_name)
+print(movie_trakt_id)
+movie_vector = get_vector_value(movie_trakt_id)
+movie_queries = query_vectors(movie_vector)
+movie_ids = query2ids(movie_queries)
+print(movie_ids)
+
+for trakt_id in movie_ids:
+    deets = get_deets_by_trakt_id(df, trakt_id)
+    print(f"{deets['title']} ({deets['year']}): {deets['overview']}")
+
+
+
Output:
+
+
[55786, 18374, 299592, 662622, 6054, 227458, 139687, 303950, 70000, 129307, 70823, 5766, 23950, 137696, 655723, 32842, 413269, 145994, 197990, 373832]
+Now You See Me (2013): An FBI agent and an Interpol detective track a team of illusionists who pull off bank heists during their performances and reward their audiences with the money.
+Trapped (1949): U.S. Treasury Department agents go after a ring of counterfeiters.
+Brute Sanity (2018): An FBI-trained neuropsychologist teams up with a thief to find a reality-altering device while her insane ex-boss unleashes bizarre traps to stop her.
+The Chase (2017): Some FBI agents hunt down a criminal
+Surveillance (2008): An FBI agent tracks a serial killer with the help of three of his would-be victims - all of whom have wildly different stories to tell.
+Marauders (2016): An untraceable group of elite bank robbers is chased by a suicidal FBI agent who uncovers a deeper purpose behind the robbery-homicides.
+Miracles for Sale (1939): A maker of illusions for magicians protects an ingenue likely to be murdered.
+Deceptors (2005): A Ghostbusters knock-off where a group of con-artists create bogus monsters to scare up some cash. They run for their lives when real spooks attack.
+The Outfit (1993): A renegade FBI agent sparks an explosive mob war between gangster crime lords Legs Diamond and Dutch Schultz.
+Bank Alarm (1937): A federal agent learns the gangsters he's been investigating have kidnapped his sister.
+The Courier (2012): A shady FBI agent recruits a courier to deliver a mysterious package to a vengeful master criminal who has recently resurfaced with a diabolical plan.
+After the Sunset (2004): An FBI agent is suspicious of two master thieves, quietly enjoying their retirement near what may - or may not - be the biggest score of their careers.
+Down Three Dark Streets (1954): An FBI Agent takes on the three unrelated cases of a dead agent to track down his killer.
+The Executioner (1970): A British intelligence agent must track down a fellow spy suspected of being a double agent.
+Ace of Cactus Range (1924): A Secret Service agent goes undercover to unmask the leader of a gang of diamond thieves.
+Firepower (1979): A mercenary is hired by the FBI to track down a powerful recluse criminal, a woman is also trying to track him down for her own personal vendetta.
+Heroes & Villains (2018): an FBI agent chases a thug to great tunes
+Federal Fugitives (1941): A government agent goes undercover in order to apprehend a saboteur who caused a plane crash.
+Hell on Earth (2012): An FBI Agent on the trail of a group of drug traffickers learns that their corruption runs deeper than she ever imagined, and finds herself in a supernatural - and deadly - situation.
+Spies (2015): A secret agent must perform a heist without time on his side
+
Tags:
- Experiment,
+ Experiment
diff --git a/docs/posts/2022-05-21-Similar-Movies-Recommender.html b/docs/posts/2022-05-21-Similar-Movies-Recommender.html
new file mode 100644
index 0000000..42b887a
--- /dev/null
+++ b/docs/posts/2022-05-21-Similar-Movies-Recommender.html
@@ -0,0 +1,438 @@
+
+
+
+
+
+
+
+
+ Hey - Post - Building a Simple Similar Movies Recommender System
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
Building a Simple Similar Movies Recommender System
+
+
Why?
+
+
I recently came across a movie/tv-show recommender, couchmoney.tv. I loved it. I decided that I wanted to build something similar, so I could tinker with it as much as I wanted.
+
+
I also wanted a recommendation system I could use via a REST API. Although I have not included that part in this post, I did eventually create it.
+
+
How?
+
+
By measuring the cosine of the angle between two vectors, you get a value in the range [-1, 1], with higher values meaning more similar (for non-negative vectors the range is [0, 1], with 0 meaning no similarity). Now, if we find a way to represent information about movies as vectors, we can use cosine similarity as a metric to find similar movies.
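The idea above can be sketched in a few lines (my illustration, using plain NumPy rather than any particular library):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # cos(theta) = (a . b) / (|a| |b|)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 3-dim "movie vectors": parallel vectors are maximally similar,
# orthogonal vectors share nothing.
a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])
c = np.array([-3.0, 0.0, 1.0])
print(cosine_similarity(a, b))  # ~1.0: same direction
print(cosine_similarity(a, c))  # ~0.0: orthogonal
```

Once each movie is represented as such a vector, "find similar movies" reduces to "find the vectors with the highest cosine similarity".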
+
+
As we are recommending just based on the content of the movies, this is called a content based recommendation system.
+
+
Data Collection
+
+
Trakt exposes a nice API to search for movies/tv-shows. To access the API, you first need to get an API key (the Trakt ID you get when you create a new application).
+
+
I decided to use SQLAlchemy with a SQLite backend, to make my life easier if I ever felt like switching to Postgres.
+
+
First, I needed to check the total number of records in Trakt’s database.
In the end, I could have dropped the embeddings field from the table schema as I never got around to using it.
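The `database` module that the scraping script imports is never shown in the post. Here is a minimal sketch of what it might contain, with an `init_db_stuff` helper and a `movies_table` whose columns are guessed from the fields the script inserts (my reconstruction, not the original code):

```python
from sqlalchemy import (Column, Integer, MetaData, String, Table,
                        create_engine, insert)          # insert is re-exported for the scraper
from sqlalchemy.exc import IntegrityError               # likewise re-exported
from sqlalchemy.orm import sessionmaker

metadata = MetaData()

# Guessed schema: one row per movie, trakt_id as the primary key
movies_table = Table(
    "movies", metadata,
    Column("trakt_id", Integer, primary_key=True),
    Column("title", String),
    Column("genres", String),
    Column("language", String),
    Column("year", Integer),
    Column("released", String),
    Column("runtime", Integer),
    Column("country", String),
    Column("overview", String),
    Column("rating", Integer),
    Column("votes", Integer),
    Column("comment_count", Integer),
    Column("tagline", String),
)

def init_db_stuff(database_url: str):
    # Create the engine, make sure the tables exist, and hand back a session factory
    engine = create_engine(database_url)
    metadata.create_all(engine)
    Session = sessionmaker(bind=engine)
    return engine, Session
```

Because the script only uses Core-style `insert` statements, the `Session` factory is returned but never strictly needed.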
+
+
Scripting Time
+
+
from database import *
+from tqdm import tqdm
+import requests
+import os
+
+trakt_id = os.getenv("TRAKT_ID")
+
+max_requests = 5000  # How many requests I wanted to wrap everything up in
+req_count = 0  # A counter for how many requests I have made
+
+years = "1900-2021"
+page = 1  # The initial page number for the search
+extended = "full"  # Required to get additional information
+limit = "10"  # No. of entries per request -- this will be automatically picked based on max_requests
+languages = "en"  # Limit to English
+
+api_base = "https://api.trakt.tv"
+database_url = "sqlite:///jlm.db"
+
+headers = {
+    "Content-Type": "application/json",
+    "trakt-api-version": "2",
+    "trakt-api-key": trakt_id
+}
+
+params = {
+    "query": "",
+    "years": years,
+    "page": page,
+    "extended": extended,
+    "limit": limit,
+    "languages": languages
+}
+
+# Helper function to get desirable values from the response
+def create_movie_dict(movie: dict):
+    m = movie["movie"]
+    movie_dict = {
+        "title": m["title"],
+        "overview": m["overview"],
+        "genres": m["genres"],
+        "language": m["language"],
+        "year": int(m["year"]),
+        "trakt_id": m["ids"]["trakt"],
+        "released": m["released"],
+        "runtime": int(m["runtime"]),
+        "country": m["country"],
+        "rating": int(m["rating"]),
+        "votes": int(m["votes"]),
+        "comment_count": int(m["comment_count"]),
+        "tagline": m["tagline"]
+    }
+    return movie_dict
+
+# Get total number of items
+params["limit"] = 1
+res = requests.get(f"{api_base}/search/movie", headers=headers, params=params)
+total_items = res.headers["x-pagination-item-count"]
+
+engine, Session = init_db_stuff(database_url)
+
+
+for page in tqdm(range(1, max_requests + 1)):
+    params["page"] = page
+    params["limit"] = int(int(total_items) / max_requests)
+    movies = []
+    res = requests.get(f"{api_base}/search/movie", headers=headers, params=params)
+
+    if res.status_code == 500:
+        break
+    elif res.status_code == 200:
+        pass
+    else:
+        print(f"OwO Code {res.status_code}")
+
+    for movie in res.json():
+        movies.append(create_movie_dict(movie))
+
+    with engine.connect() as conn:
+        for movie in movies:
+            with conn.begin() as trans:
+                stmt = insert(movies_table).values(
+                    trakt_id=movie["trakt_id"], title=movie["title"], genres=" ".join(movie["genres"]),
+                    language=movie["language"], year=movie["year"], released=movie["released"],
+                    runtime=movie["runtime"], country=movie["country"], overview=movie["overview"],
+                    rating=movie["rating"], votes=movie["votes"], comment_count=movie["comment_count"],
+                    tagline=movie["tagline"])
+                try:
+                    result = conn.execute(stmt)
+                    trans.commit()
+                except IntegrityError:
+                    trans.rollback()
+    req_count += 1
+
+
+
(Note: I was well within the rate limit, so I did not have to slow down or implement any other measures.)
+
+
Running this script took me approximately 3 hours and resulted in an SQLite database of 141.5 MB.
+
+
Embeddings!
+
+
I did not want to put my poor Mac through the estimated 23 hours it would have taken to embed the sentences. I decided to use Google Colab instead.
+
+
Because of the small size of the database file, I was able to just upload the file.
+
+
For the encoding model, I decided to use the pretrained paraphrase-multilingual-MiniLM-L12-v2 model from SentenceTransformers, a Python framework for SOTA sentence, text and image embeddings. I wanted to use a multilingual model as I personally consume content in various languages (natively, no dubs or subs) and some of the sources for their information do not translate to English. As of writing this post, I did not include any other database except Trakt.
+
+
While deciding how I was going to process the embeddings, I came across multiple solutions:
+
+
+
Milvus - An open-source vector database with similarity search functionality
Pinecone - A fully managed vector database with similarity search functionality
+
+
+
I did not want to waste time setting up a self-hosted option, so I decided to go with Pinecone, which offers 1M 768-dim vectors for free with no credit card required (our embeddings are 384-dim dense vectors).
+
+
Getting started with Pinecone was as easy as:
+
+
+
Signing up
+
Specifying the index name and vector dimensions along with the similarity search metric (Cosine Similarity for our use case)
+
Getting the API key
+
Installing the Python module (pinecone-client)
+
+
+
import pandas as pd
+import pinecone
+from sentence_transformers import SentenceTransformer
+from tqdm import tqdm
+from database import *  # provides init_db_stuff
+
+database_url = "sqlite:///jlm.db"
+PINECONE_KEY = "not-this-at-all"
+batch_size = 32
+
+pinecone.init(api_key=PINECONE_KEY, environment="us-west1-gcp")
+index = pinecone.Index("movies")
+
+model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2", device="cuda")
+engine, Session = init_db_stuff(database_url)
+
+df = pd.read_sql("SELECT * FROM movies", engine)
+df["combined_text"] = df["title"] + ": " + df["overview"].fillna('') + " - " + df["tagline"].fillna('') + " Genres:- " + df["genres"].fillna('')
+
+# Creating the embeddings in batches and upserting them into the index
+for x in tqdm(range(0, len(df), batch_size)):
+    to_send = []
+    trakt_ids = df["trakt_id"][x:x + batch_size].tolist()
+    sentences = df["combined_text"][x:x + batch_size].tolist()
+    embeddings = model.encode(sentences)
+    for idx, value in enumerate(trakt_ids):
+        to_send.append(
+            (str(value), embeddings[idx].tolist())
+        )
+    index.upsert(to_send)
+
+
+
That's it!
+
+
Interacting with Vectors
+
+
We use the trakt_id for the movie as the ID for the vectors and upsert it into the index.
+
+
To find similar items, we will first have to map the name of the movie to its trakt_id, get the embeddings we have for that id and then perform a similarity search. It is possible that this additional step of mapping could be avoided by storing information as metadata in the index.
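The helper functions used in the snippet below (`get_trakt_id`, `get_vector_value`, `query_vectors`, `query2ids`, `get_deets_by_trakt_id`) are not defined anywhere in the post. One plausible reconstruction, assuming the pandas DataFrame `df` and the pinecone `index` object from the earlier scripts are in scope (the response shapes follow the older pinecone-client API and may differ by version):

```python
def get_trakt_id(df, movie_name: str) -> int:
    # Map a title to its trakt_id via the DataFrame loaded from SQLite
    return int(df[df["title"] == movie_name]["trakt_id"].iloc[0])

def get_vector_value(trakt_id: int):
    # Fetch the stored embedding for this movie from the Pinecone index
    # (assumes the global `index` from the embedding script)
    return index.fetch([str(trakt_id)])["vectors"][str(trakt_id)]["values"]

def query_vectors(vector, top_k: int = 20):
    # Similarity search against the index
    return index.query(vector=vector, top_k=top_k)

def query2ids(queries) -> list:
    # Pull the trakt_ids back out of the query response
    return [int(match["id"]) for match in queries["matches"]]

def get_deets_by_trakt_id(df, trakt_id: int) -> dict:
    # Look the movie back up in the DataFrame for display
    row = df[df["trakt_id"] == trakt_id].iloc[0]
    return {"title": row["title"], "year": row["year"], "overview": row["overview"]}
```

Storing the title as metadata in the index, as mentioned above, would let you skip the DataFrame lookups entirely.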
movie_name = "Now You See Me"
+
+movie_trakt_id = get_trakt_id(df, movie_name)
+print(movie_trakt_id)
+movie_vector = get_vector_value(movie_trakt_id)
+movie_queries = query_vectors(movie_vector)
+movie_ids = query2ids(movie_queries)
+print(movie_ids)
+
+for trakt_id in movie_ids:
+    deets = get_deets_by_trakt_id(df, trakt_id)
+    print(f"{deets['title']} ({deets['year']}): {deets['overview']}")
+
+
+
Output:
+
+
[55786, 18374, 299592, 662622, 6054, 227458, 139687, 303950, 70000, 129307, 70823, 5766, 23950, 137696, 655723, 32842, 413269, 145994, 197990, 373832]
+Now You See Me (2013): An FBI agent and an Interpol detective track a team of illusionists who pull off bank heists during their performances and reward their audiences with the money.
+Trapped (1949): U.S. Treasury Department agents go after a ring of counterfeiters.
+Brute Sanity (2018): An FBI-trained neuropsychologist teams up with a thief to find a reality-altering device while her insane ex-boss unleashes bizarre traps to stop her.
+The Chase (2017): Some FBI agents hunt down a criminal
+Surveillance (2008): An FBI agent tracks a serial killer with the help of three of his would-be victims - all of whom have wildly different stories to tell.
+Marauders (2016): An untraceable group of elite bank robbers is chased by a suicidal FBI agent who uncovers a deeper purpose behind the robbery-homicides.
+Miracles for Sale (1939): A maker of illusions for magicians protects an ingenue likely to be murdered.
+Deceptors (2005): A Ghostbusters knock-off where a group of con-artists create bogus monsters to scare up some cash. They run for their lives when real spooks attack.
+The Outfit (1993): A renegade FBI agent sparks an explosive mob war between gangster crime lords Legs Diamond and Dutch Schultz.
+Bank Alarm (1937): A federal agent learns the gangsters he's been investigating have kidnapped his sister.
+The Courier (2012): A shady FBI agent recruits a courier to deliver a mysterious package to a vengeful master criminal who has recently resurfaced with a diabolical plan.
+After the Sunset (2004): An FBI agent is suspicious of two master thieves, quietly enjoying their retirement near what may - or may not - be the biggest score of their careers.
+Down Three Dark Streets (1954): An FBI Agent takes on the three unrelated cases of a dead agent to track down his killer.
+The Executioner (1970): A British intelligence agent must track down a fellow spy suspected of being a double agent.
+Ace of Cactus Range (1924): A Secret Service agent goes undercover to unmask the leader of a gang of diamond thieves.
+Firepower (1979): A mercenary is hired by the FBI to track down a powerful recluse criminal, a woman is also trying to track him down for her own personal vendetta.
+Heroes & Villains (2018): an FBI agent chases a thug to great tunes
+Federal Fugitives (1941): A government agent goes undercover in order to apprehend a saboteur who caused a plane crash.
+Hell on Earth (2012): An FBI Agent on the trail of a group of drug traffickers learns that their corruption runs deeper than she ever imagined, and finds herself in a supernatural - and deadly - situation.
+Spies (2015): A secret agent must perform a heist without time on his side
+
Writing a simple Machine-Learning powered Chatbot (or, dare I say, virtual personal assistant) in Swift using CoreML.
--
cgit v1.2.3
From d96373570c7110e0b9e872b494a79c855b18b686 Mon Sep 17 00:00:00 2001
From: navanchauhan
Date: Sun, 22 May 2022 12:03:37 -0600
Subject: added new post movie recommender
---
docs/assets/flixrec/filter.png | Bin 0 -> 242231 bytes
docs/assets/flixrec/home.png | Bin 0 -> 160255 bytes
docs/assets/flixrec/multiple.png | Bin 0 -> 251294 bytes
docs/assets/flixrec/results.png | Bin 0 -> 280362 bytes
4 files changed, 0 insertions(+), 0 deletions(-)
create mode 100644 docs/assets/flixrec/filter.png
create mode 100644 docs/assets/flixrec/home.png
create mode 100644 docs/assets/flixrec/multiple.png
create mode 100644 docs/assets/flixrec/results.png
(limited to 'docs')
diff --git a/docs/assets/flixrec/filter.png b/docs/assets/flixrec/filter.png
new file mode 100644
index 0000000..c1e4c52
Binary files /dev/null and b/docs/assets/flixrec/filter.png differ
diff --git a/docs/assets/flixrec/home.png b/docs/assets/flixrec/home.png
new file mode 100644
index 0000000..2d6fb51
Binary files /dev/null and b/docs/assets/flixrec/home.png differ
diff --git a/docs/assets/flixrec/multiple.png b/docs/assets/flixrec/multiple.png
new file mode 100644
index 0000000..f35d342
Binary files /dev/null and b/docs/assets/flixrec/multiple.png differ
diff --git a/docs/assets/flixrec/results.png b/docs/assets/flixrec/results.png
new file mode 100644
index 0000000..a239ba4
Binary files /dev/null and b/docs/assets/flixrec/results.png differ
--
cgit v1.2.3
From 1ca3a0abf3b1dad33ce5d5859253220e8b2205d1 Mon Sep 17 00:00:00 2001
From: navanchauhan
Date: Sun, 22 May 2022 12:13:58 -0600
Subject: change tags
---
docs/feed.rss | 4 ++--
docs/index.html | 4 +---
docs/posts/2022-05-21-Similar-Movies-Recommender.html | 5 +++++
docs/posts/index.html | 4 +---
4 files changed, 9 insertions(+), 8 deletions(-)
(limited to 'docs')
diff --git a/docs/feed.rss b/docs/feed.rss
index 3f65a70..4ea732c 100644
--- a/docs/feed.rss
+++ b/docs/feed.rss
@@ -4,8 +4,8 @@
Navan's ArchiveRare Tips, Tricks and Posts
https://web.navan.dev/en
- Sun, 22 May 2022 11:59:10 -0000
- Sun, 22 May 2022 11:59:10 -0000
+ Sun, 22 May 2022 12:13:40 -0000
+ Sun, 22 May 2022 12:13:40 -0000250
diff --git a/docs/index.html b/docs/index.html
index e2ccc9f..5d505c1 100644
--- a/docs/index.html
+++ b/docs/index.html
@@ -57,9 +57,7 @@
Transformers,
- Movies,
-
- Recommender-System
+ Recommendation-System
diff --git a/docs/posts/2022-05-21-Similar-Movies-Recommender.html b/docs/posts/2022-05-21-Similar-Movies-Recommender.html
index 42b887a..1b105b9 100644
--- a/docs/posts/2022-05-21-Similar-Movies-Recommender.html
+++ b/docs/posts/2022-05-21-Similar-Movies-Recommender.html
@@ -25,6 +25,8 @@
+
+
@@ -429,6 +431,9 @@ Spies (2015): A secret agent must perform a heist without time on his side
Filter based on popularity: The data already exists in the indexed database
- Building a Simple Similar Movies Recommender System
+ Building a Similar Movies Recommendation System
- Building a Content Based Similar Movies Recommender System
+ Building a Content Based Similar Movies Recommendation System
https://web.navan.dev/posts/2022-05-21-Similar-Movies-Recommender.html
Sat, 21 May 2022 17:56:00 -0000
- Building a Simple Similar Movies Recommender System
+ Building a Similar Movies Recommendation System
Building a Content Based Similar Movies Recommender System
+
Building a Content Based Similar Movies Recommendatiom System
Published On: 2022-05-21 17:56
Tags:
diff --git a/docs/posts/2022-05-21-Similar-Movies-Recommender.html b/docs/posts/2022-05-21-Similar-Movies-Recommender.html
index 1b105b9..2c0b488 100644
--- a/docs/posts/2022-05-21-Similar-Movies-Recommender.html
+++ b/docs/posts/2022-05-21-Similar-Movies-Recommender.html
@@ -6,17 +6,17 @@
- Hey - Post - Building a Simple Similar Movies Recommender System
+ Hey - Post - Building a Similar Movies Recommendation System
-
-
-
-
-
-
+
+
+
+
+
+
@@ -41,7 +41,7 @@
-
Building a Simple Similar Movies Recommender System