import os
import tweepy

consumer_key = os.environ["consumer_key"]
I am not handling lists or images right now.
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)
The program needs to convert the blog post into text fragments.
It reads the markdown file, removes the YAML front matter at the top, checks for headers, and splits the content.
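The splitting step can be sketched roughly like this (a hedged reconstruction — the original script's exact rules for headers and fragment length are not shown here, and the function name is mine):

```python
# Hypothetical sketch: drop the YAML front matter, then pack paragraphs
# into tweet-sized fragments. The 280-character limit is an assumption.
def split_post(markdown_text, limit=280):
    body = markdown_text.split("---", 2)[-1]  # text after the front matter
    fragments, current = [], ""
    for para in body.split("\n\n"):
        para = para.strip()
        if not para:
            continue
        if len(current) + len(para) + 1 <= limit:
            current = (current + "\n" + para).strip()
        else:
            if current:
                fragments.append(current)
            current = para
    if current:
        fragments.append(current)
    return fragments
```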
tweets = []
first___n = 0
    print("ERROR")
else:
    tweets.append(line)
Every status update posted with tweepy has an ID attached to it; for the next tweet in the thread, the program passes that ID when calling the function.
It also appends each fragment's position as i/n to the tweet.
for idx, tweet in enumerate(tweets):
    tweet += " {}/{}".format(idx + 1, len(tweets))
    if idx == 0:
        a = api.update_status(tweet)
    else:
        a = api.update_status(tweet, in_reply_to_status_id=a.id)
    print(len(tweet), end=" ")
    print("{}/{}\n".format(idx + 1, len(tweets)))
Finally, it replies to the last tweet in the thread with the link to the post.
]]>
I actually added the code to this post after running the program.
Imports
%tensorflow_version 2.x  # This is for telling Colab that you want to use TF 2.0, ignore if running on local machine
from PIL import Image  # We use the PIL Library to resize images
import numpy as np
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dense, Flatten, Dropout
We resize all the images to 50x50 and append the NumPy array of each image, as well as its label (Infected or Not), to common arrays.
data = []
labels = []
Parasitized = os.listdir("./cell_images/Parasitized/")
        labels.append(1)
    except AttributeError:
        print("")
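The loading loop is only partially shown above; a hedged reconstruction (directory layout, 50x50 size, and label assignment are assumptions based on the snippet, and the function name is mine) might look like:

```python
# Hypothetical reconstruction of the dataset-loading step: every image under
# root/<class>/ is resized to 50x50 and collected with an integer label.
import os
import numpy as np
from PIL import Image

def load_images(root, size=(50, 50)):
    data, labels = [], []
    for label, folder in enumerate(sorted(os.listdir(root))):
        for fname in os.listdir(os.path.join(root, folder)):
            try:
                img = Image.open(os.path.join(root, folder, fname)).resize(size)
                data.append(np.array(img))
                labels.append(label)
            except OSError:  # skip unreadable / non-image files
                pass
    return np.array(data), np.array(labels)
```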
We use the Adam optimiser, an adaptive learning-rate optimisation algorithm designed specifically for training deep neural networks; it adjusts its learning rate automatically to get the best results.
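To make the "adaptive learning rate" point concrete, here is a minimal NumPy sketch of a single Adam update in its textbook form (this illustrates the mechanics, not Keras's internals; the function name and defaults are mine):

```python
# One Adam update step: m and v are running estimates of the gradient's
# mean and uncentred variance; the effective step size adapts per parameter.
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)   # bias-corrected first moment
    v_hat = v / (1 - b2 ** t)   # bias-corrected second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```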
(Note: I was well within the rate-limit so I did not have to slow down or implement any other measures)
As of writing this post, I did not include any other database except Trakt.
Installing the Python module (pinecone-client)
import pandas as pd
import pinecone
from sentence_transformers import SentenceTransformer
from tqdm import tqdm
            str(value), embeddings[idx].tolist()))
index.upsert(to_send)
That's it!
To find similar items, we will first have to map the name of the movie to its trakt_id, get the embeddings we have for that ID, and then perform a similarity search.
It is possible that this additional mapping step could be avoided by storing the information as metadata in the index.
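Conceptually, the similarity search Pinecone performs server-side boils down to ranking stored vectors by cosine similarity against the query vector; a library-agnostic NumPy sketch (function name and data layout are mine):

```python
# Rank stored vectors by cosine similarity to a query vector.
import numpy as np

def most_similar(query_vec, id_to_vec, top_k=5):
    ids = list(id_to_vec)
    mat = np.array([id_to_vec[i] for i in ids], dtype=float)
    q = np.asarray(query_vec, dtype=float)
    sims = mat @ q / (np.linalg.norm(mat, axis=1) * np.linalg.norm(q))
    order = np.argsort(-sims)[:top_k]
    return [(ids[i], float(sims[i])) for i in order]
```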
def get_trakt_id(df, title: str):
    rec = df[df["title"].str.lower() == title.lower()]
    if len(rec.trakt_id.values.tolist()) > 1:
        print(f"multiple values found... {len(rec.trakt_id.values)}")
        "runtime": df.runtime.values[0],
        "year": df.year.values[0]}
Testing it Out
movie_name = "Now You See Me"
movie_trakt_id = get_trakt_id(df, movie_name)
print(movie_trakt_id)
for trakt_id in movie_ids:
    deets = get_deets_by_trakt_id(df, trakt_id)
    print(f"{deets['title']} ({deets['year']}): {deets['overview']}")
Output:
me.fset  me.fset3  me.iset
Create a new file called index.html in your project folder. This is the basic template we are going to use. Replace me with the root filename of your image; for example, NeverGonnaGiveYouUp.png becomes NeverGonnaGiveYouUp. Make sure you have copied all three files from the output folder in the previous step to the root of your project folder.
In this template we are creating an AFrame scene and telling it that we want to use NFT tracking. The amazing part about using AFrame is that we can use all AFrame objects!
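A minimal template along these lines might look like the following (a sketch, not the post's exact markup: the script URLs/versions and the a-box payload are illustrative assumptions):

```html
<!doctype html>
<html>
  <head>
    <!-- A-Frame plus the AR.js build with NFT (image) tracking; versions are examples -->
    <script src="https://aframe.io/releases/1.3.0/aframe.min.js"></script>
    <script src="https://raw.githack.com/AR-js-org/AR.js/master/aframe/build/aframe-ar-nft.js"></script>
  </head>
  <body style="margin: 0; overflow: hidden;">
    <a-scene embedded arjs="trackingMethod: best; sourceType: webcam;">
      <!-- "me" is the root filename of the .fset/.fset3/.iset descriptors -->
      <a-nft type="nft" url="./me">
        <a-box position="0 0 0" color="tomato"></a-box>
      </a-nft>
      <a-entity camera></a-entity>
    </a-scene>
  </body>
</html>
```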
Serving HTTP on 0.0.0.0 port 8000 ...
]]>
https://web.navan.dev/posts/2022-11-07-a-new-method-to-blog.html

A new method to blog

Writing posts in markdown using pen and paper

https://web.navan.dev/posts/2022-11-07-a-new-method-to-blog.html
Mon, 07 Nov 2022 23:29:00 -0000
A new method to blog

Paper Website is a service that lets you build a website with just pen and paper. I am going to try and replicate the process.

The Plan

The Continuity feature on macOS + iOS lets you scan PDFs directly from your iPhone. I want to be able to scan these pages and automatically run an Automator script that takes the PDF and OCRs the text. Then I can further clean the text and convert it from markdown.

Challenges

I quickly realised that the OCR software I planned on using could not detect my shitty handwriting accurately. I tried ABBYY FineReader, Prizmo and OCRmyPDF (ABBYY FineReader and Prizmo support being automated by Automator).

Now, I could either write neater, or use an external API like Microsoft Azure.

Solution

OCR

In the PDFs, all the scans are saved as images on a page. I extract the image and then send it to Azure's API.

Paragraph Breaks
The recognised text had multiple line breaks in the middle of sentences. Therefore, I use a pilcrow to mark paragraph breaks. Rather than trying to draw an actual pilcrow by hand, I just write the HTML entity ¶, which renders as the pilcrow character.
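The cleanup can be sketched in a few lines of Python (a hypothetical helper; the name is mine):

```python
# Join the OCR'd lines (which break mid-sentence) and split paragraphs
# wherever the pilcrow character appears.
def rebuild_paragraphs(ocr_lines):
    text = " ".join(line.strip() for line in ocr_lines)
    return "\n\n".join(p.strip() for p in text.split("¶") if p.strip())
```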
Where is the code?

I created a GitHub Gist with a sample Python script that takes the PDF and prints the text.

A more complete version with Automator scripts and an entire publishing pipeline will be available as a GitHub and Gitea repo soon.

* In Part 2, I will discuss some more features *
]]>
https://web.navan.dev/posts/2020-03-03-Playing-With-Android-TV.html
Polynomial regression even fits a non-linear relationship (e.g. when the points don't fit a straight line).
df  # this gives us a preview of the dataset we are working with
|Position         |Level|Salary |
|-----------------|-----|-------|
|Business Analyst |1    |45000  |
|Junior Consultant|2    |50000  |
|Senior Partner   |8    |300000 |
|C-level          |9    |500000 |
|CEO              |10   |1000000|
We use the Salary column as the ordinate (y-coordinate) and the Level column as the abscissa (x-coordinate).
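In code, this conversion is just a matter of pulling the two columns out of the dataframe (a sketch; the column names are taken from the table above and the stand-in dataframe holds only the rows shown there):

```python
import pandas as pd

# A stand-in dataframe with the columns shown earlier
df = pd.DataFrame({"Level": [1, 2, 8, 9, 10],
                   "Salary": [45000, 50000, 300000, 500000, 1000000]})
abscissa = df["Level"].tolist()   # x-coordinates
ordinate = df["Salary"].tolist()  # y-coordinates
```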
We then predict the Y values using the X values, and plot the result to compare the actual data and predicted values.
Linear Equation
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(no_of_epochs):
        for (x, y) in zip(abscissa, ordinate):
    constant = sess.run(b)
    print(training_cost, coefficient1, constant)
Epoch 1000 : Training Cost: 88999125000.0  a, b: 180396.42 -478869.12
Epoch 2000 : Training Cost: 88999125000.0  a, b: 180396.42 -478869.12
Epoch 3000 : Training Cost: 88999125000.0  a, b: 180396.42 -478869.12
Epoch 4000 : Training Cost: 88999125000.0  a, b: 180396.42 -478869.12
Epoch 24000 : Training Cost: 88999125000.0  a, b: 180396.42 -478869.12
Epoch 25000 : Training Cost: 88999125000.0  a, b: 180396.42 -478869.12
88999125000.0 180396.42 -478869.12
predictions = []
for x in abscissa:
    predictions.append((coefficient1 * x + constant))
plt.plot(abscissa, ordinate, 'ro', label='Original data')
plt.title('Linear Regression Result')
plt.legend()
plt.show()
Quadratic Equation
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(no_of_epochs):
        for (x, y) in zip(abscissa, ordinate):
    constant = sess.run(c)
    print(training_cost, coefficient1, coefficient2, constant)
Epoch 1000 : Training Cost: 52571360000.0  a, b, c: 1002.4456 1097.0197 1276.6921
Epoch 2000 : Training Cost: 37798890000.0  a, b, c: 1952.4263 2130.2825 2469.7756
Epoch 3000 : Training Cost: 26751185000.0  a, b, c: 2839.5825 3081.6118 3554.351
Epoch 4000 : Training Cost: 19020106000.0  a, b, c: 3644.56 3922.9563 4486.3135
Epoch 24000 : Training Cost: 8088001000.0  a, b, c: 6632.96 3399.878 -79.89219
Epoch 25000 : Training Cost: 8058094600.0  a, b, c: 6659.793 3227.2517 -463.03156
8058094600.0 6659.793 3227.2517 -463.03156
predictions = []
for x in abscissa:
    predictions.append((coefficient1 * pow(x, 2) + coefficient2 * x + constant))
plt.plot(abscissa, ordinate, 'ro', label='Original data')
plt.title('Quadratic Regression Result')
plt.legend()
plt.show()
Cubic
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(no_of_epochs):
        for (x, y) in zip(abscissa, ordinate):
    constant = sess.run(d)
    print(training_cost, coefficient1, coefficient2, coefficient3, constant)
Epoch 1000 : Training Cost: 4279814000.0  a, b, c, d: 670.1527 694.4212 751.4653 903.9527
Epoch 2000 : Training Cost: 3770950400.0  a, b, c, d: 742.6414 666.3489 636.94525 859.2088
Epoch 3000 : Training Cost: 3717708300.0  a, b, c, d: 756.2582 569.3339 448.105 748.23956
Epoch 4000 : Training Cost: 3667464000.0  a, b, c, d: 769.4476 474.0318 265.5761 654.75525
Epoch 24000 : Training Cost: 3070361300.0  a, b, c, d: 975.52875 -1095.4292 -2211.854 1847.4485
Epoch 25000 : Training Cost: 3052791300.0  a, b, c, d: 983.4346 -1159.7922 -2286.9412 2027.4857
3052791300.0 983.4346 -1159.7922 -2286.9412 2027.4857
predictions = []
for x in abscissa:
    predictions.append((coefficient1 * pow(x, 3) + coefficient2 * pow(x, 2) + coefficient3 * x + constant))
plt.plot(abscissa, ordinate, 'ro', label='Original data')
plt.title('Cubic Regression Result')
plt.legend()
plt.show()
Quartic
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(no_of_epochs):
        for (x, y) in zip(abscissa, ordinate):
    constant = sess.run(e)
    print(training_cost, coefficient1, coefficient2, coefficient3, coefficient4, constant)
Epoch 1000 : Training Cost: 1902632600.0  a, b, c, d, e: 84.48304 52.210594 54.791424 142.51952 512.0343
Epoch 2000 : Training Cost: 1854316200.0  a, b, c, d, e: 88.998955 13.073557 14.276088 223.55667 1056.4655
Epoch 3000 : Training Cost: 1812812400.0  a, b, c, d, e: 92.9462 -22.331177 -15.262934 327.41858 1634.9054
Epoch 4000 : Training Cost: 1775716000.0  a, b, c, d, e: 96.42522 -54.64535 -35.829437 449.5028 2239.1392
Epoch 24000 : Training Cost: 1252052600.0  a, b, c, d, e: 135.9583 -493.38254 90.268616 3764.0078 15010.481
Epoch 25000 : Training Cost: 1231713700.0  a, b, c, d, e: 137.54753 -512.1876 101.59372 3926.4897 15609.368
1231713700.0 137.54753 -512.1876 101.59372 3926.4897 15609.368
predictions = []
for x in abscissa:
    predictions.append((coefficient1 * pow(x, 4) + coefficient2 * pow(x, 3) + coefficient3 * pow(x, 2) + coefficient4 * x + constant))
plt.plot(abscissa, ordinate, 'ro', label='Original data')
plt.title('Quartic Regression Result')
plt.legend()
plt.show()
Quintic
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(no_of_epochs):
        for (x, y) in zip(abscissa, ordinate):
    coefficient4 = sess.run(d)
    coefficient5 = sess.run(e)
    constant = sess.run(f)
Epoch 1000 : Training Cost: 1409200100.0  a, b, c, d, e, f: 7.949472 7.462195 5.626034 184.29028 484.00223 1024.0083
Epoch 2000 : Training Cost: 1306882400.0  a, b, c, d, e, f: 8.732181 -4.0085897 73.25298 315.90103 904.08887 2004.9749
Epoch 3000 : Training Cost: 1212606000.0  a, b, c, d, e, f: 9.732249 -16.90125 86.28379 437.06552 1305.055 2966.2188
Epoch 4000 : Training Cost: 1123640400.0  a, b, c, d, e, f: 10.74851 -29.82692 98.59997 555.331 1698.4631 3917.9155
Epoch 24000 : Training Cost: 229660080.0  a, b, c, d, e, f: 27.102589 -238.44817 309.35342 2420.4185 7770.5728 19536.19
Epoch 25000 : Training Cost: 216972400.0  a, b, c, d, e, f: 27.660324 -245.69016 318.10062 2483.3608 7957.354 20027.707
216972400.0 27.660324 -245.69016 318.10062 2483.3608 7957.354 20027.707
predictions = []
for x in abscissa:
    predictions.append((coefficient1 * pow(x, 5) + coefficient2 * pow(x, 4) + coefficient3 * pow(x, 3) + coefficient4 * pow(x, 2) + coefficient5 * x + constant))
plt.plot(abscissa, ordinate, 'ro', label='Original data')
plt.title('Quintic Regression Result')
plt.legend()
plt.show()
I created a sample JSON with only 3 examples (I know, very few, but it works for a demo).
[
    {
        "tokens": ["Tell", "me", "about", "the", "drug", "Aspirin", "."],
        "labels": ["NONE", "NONE", "NONE", "NONE", "NONE", "COMPOUND", "NONE"]
        "labels": ["NONE", "NONE", "NONE", "NONE", "COMPOUND", "NONE", "NONE"]
    }
]
import CoreML
import NaturalLanguage

let mlModelClassifier = try IntentDetection_1(configuration: MLModelConfiguration()).model
let tagger = NLTagger(tagSchemes: [.nameType, NLTagScheme("Apple")])
tagger.setModels([tagPredictor], forTagScheme: NLTagScheme("Apple"))
Now, we define a simple structure which the custom function(s) can use to access the provided input.
It can also be used to hold additional variables.
The latter can be replaced with a function which asks the user for the input.
struct User {
    static var message = ""
}
    }
}
Sometimes, no action needs to be performed, and the bot can use a predefined set of responses.
Otherwise, if an action is required, it can call the custom action.
let defaultResponses = [
    "greetings": "Hello",
    "banter": "no, plix no"
]
let customActions = [
    "deez-drug": customAction
]
In the sample input, the program updates User.message and checks whether it has a default response for it.
Otherwise, it calls the custom action.
let sampleMessages = [
    "Hey there, how is it going",
    "hello, there",
    "Who let the dogs out",
        print(customActions[prediction!]!())
    }
}
First, we import the following if we have not imported them before.
import cv2
import os
Then we read the file using OpenCV.
image = cv2.imread(imagePath)
The cv2.imread() function returns a NumPy array representing the image, so we need to convert it before we can use it with PIL.
image_from_array = Image.fromarray(image, 'RGB')
Then we resize the image
size_image = image_from_array.resize((50, 50))
After this we create a batch consisting of only one image
p = np.expand_dims(size_image, 0)
We then convert this uint8 data type to float32.
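That conversion is a one-liner, sketched here on a stand-in batch (the real p comes from the steps above):

```python
import numpy as np

p = np.zeros((1, 50, 50, 3), dtype=np.uint8)  # stand-in for the batch built above
p = p.astype(np.float32)                      # uint8 -> float32
```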
After you accept that you are okay with your IP address being logged, it will prompt you to update your DNS record: you need to create a new TXT record in the DNS settings for your domain.
To use the certificate, simply copy cert.pem and privkey.pem to your working directory (changing the permissions as appropriate) and include them in the command.
Caveat with copying the certificate: if you renew it, you will have to re-copy the files.
]]>
Whenever you are looking for a dataset, always try searching on Kaggle and GitHub.
This allows you to train the model on the GPU. Turicreate is built on top of Apache's MXNet framework; to use the GPU, we need to install a CUDA-compatible MXNet package.
+-----------+--------+-----------+--------------+-------------------+---------------------+
| Iteration | Passes | Step size | Elapsed Time | Training Accuracy | Validation Accuracy |
+-----------+--------+-----------+--------------+-------------------+---------------------+
| 0         | 2      | 1.000000  | 1.156349     | 0.889680          | 0.790036            |
| 4         | 8      | 1.000000  | 1.814194     | 0.999063          | 0.925267            |
| 9         | 14     | 1.000000  | 2.507072     | 1.000000          | 0.911032            |
+-----------+--------+-----------+--------------+-------------------+---------------------+
Testing the Model
test_predictions = model.predict(test)
accuracy = tc.evaluation.accuracy(test['label'], test_predictions)
print(f'Topic classifier model has a testing accuracy of {accuracy*100}%', flush=True)
We have just created our own Fake News Detection Model which has an accuracy of 92%!
example_text = {"title": ["Middling ‘Rise Of Skywalker’ Review Leaves Fan On Fence About Whether To Threaten To Kill Critic"],
                "text": ["Expressing ambivalence toward the relatively balanced appraisal of the film, Star Wars fan Miles Ariely admitted Thursday that an online publication’s middling review of The Rise Of Skywalker had left him on the fence about whether he would still threaten to kill the critic who wrote it. “I’m really of two minds about this, because on the one hand, he said the new movie fails to live up to the original trilogy, which makes me at least want to throw a brick through his window with a note telling him to watch his back,” said Ariely, confirming he had already drafted an eight-page-long death threat to Stan Corimer of the website Screen-On Time, but had not yet decided whether to post it to the reviewer’s Facebook page. “On the other hand, though, he commended J.J. Abrams’ skillful pacing and faithfulness to George Lucas’ vision, which makes me wonder if I should just call the whole thing off. Now, I really don’t feel like camping outside his house for hours. Maybe I could go with a response that’s somewhere in between, like, threatening to kill his dog but not everyone in his whole family? I don’t know. This is a tough one.” At press time, sources reported that Ariely had resolved to wear his Ewok costume while he murdered the critic in his sleep."]}
example_prediction = model.classify(tc.SFrame(example_text))
print(example_prediction, flush=True)
Note: To download files from Google Colab, simply click on the files section in the sidebar, right click on filename and then click on download
Description: The bag-of-words model is a simplifying representation used in NLP, in which a text is represented as the bag (multiset) of its words.
We define our bag of words function
func bow(text: String) -> [String: Double] {
    var bagOfWords = [String: Double]()
    let tagger = NSLinguisticTagger(tagSchemes: [.tokenType], options: 0)
    return bagOfWords
}
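For readers unfamiliar with Swift, the same bag-of-words idea in a few lines of Python (illustrative only; the Swift version above uses NSLinguisticTagger for tokenisation, while this sketch just splits on whitespace):

```python
# Tokenise on whitespace and count occurrences of each lower-cased token.
def bow(text):
    bag = {}
    for token in text.lower().split():
        bag[token] = bag.get(token, 0.0) + 1.0
    return bag
```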
Finally, we implement a simple function which reads the two text fields, creates their bag-of-words representations, and displays an alert with the appropriate result.
Complete Code
import SwiftUI

struct ContentView: View {
    @State private var title: String = ""
        ContentView()
    }
}
]]>
If you want to directly open the HTML file in your browser after saving, don't forget to set CORS_PROXY=""
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
</script>
<noscript>Uh Oh! Your browser does not support JavaScript or JavaScript is currently disabled. Please enable JavaScript or switch to a different browser.</noscript>
</body>
</html>
]]>
Tue, 14 Jan 2020 00:10:00 -0000
Converting between image and NumPy array
import numpy
import PIL

# Convert PIL Image to NumPy array
# Convert array to Image
img = PIL.Image.fromarray(arr)
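A self-contained round-trip of the two conversions above, using a tiny generated image as a stand-in:

```python
import numpy
import PIL.Image

img = PIL.Image.new("RGB", (2, 2))   # stand-in 2x2 image
arr = numpy.array(img)               # PIL Image -> NumPy array
img2 = PIL.Image.fromarray(arr)      # NumPy array -> PIL Image
```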
+-------------------------+------------------------+
| path                    | image                  |
+-------------------------+------------------------+
| ./train/default/1.jpg   | Height: 224 Width: 224 |
[2028 rows x 3 columns]
Note: Only the head of the SFrame is printed.
You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.
Making the Model
import turicreate as tc

# Load the data
data = tc.SFrame('fire-smoke.sframe')
# Export for use in Core ML
model.export_coreml('fire-smoke.mlmodel')
Performing feature extraction on resized images...
+
%tensorflow_version2.x#This is for telling Colab that you want to use TF 2.0, ignore if running on local machine
+
+
%tensorflow_version2.x#This is for telling Colab that you want to use TF 2.0, ignore if running on local machinefromPILimportImage# We use the PIL Library to resize imagesimportnumpyasnp
@@ -59,21 +60,25 @@
importmatplotlib.pyplotaspltfromkeras.modelsimportSequentialfromkeras.layersimportConv2D,MaxPooling2D,Dense,Flatten,Dropout
-
We use the Adam optimiser as it is an adaptive learning rate optimisation algorithm that's been designed specifically for training deep neural networks, which means it changes its learning rate automatically to get the best results
diff --git a/docs/posts/2019-12-16-TensorFlow-Polynomial-Regression.html b/docs/posts/2019-12-16-TensorFlow-Polynomial-Regression.html
index 7bfe8d4..f0dad82 100644
--- a/docs/posts/2019-12-16-TensorFlow-Polynomial-Regression.html
+++ b/docs/posts/2019-12-16-TensorFlow-Polynomial-Regression.html
@@ -69,12 +69,14 @@ Polynomial regression even fits a non-linear relationship (e.g when the points d
df# this gives us a preview of the dataset we are working with
-
+
+
df# this gives us a preview of the dataset we are working with
+
+
-
|Position|Level|Salary|
+
+
|Position|Level|Salary||-------------------|-------|---------||BusinessAnalyst|1|45000||JuniorConsultant|2|50000|
@@ -121,81 +134,100 @@ Polynomial regression even fits a non-linear relationship (e.g when the points d
|SeniorPartner|8|300000||C-level|9|500000||CEO|10|1000000|
-
+
+
We convert the salary column as the ordinate (y-coordinate) and level column as the abscissa
@@ -204,7 +236,8 @@ values using the X values. We then plot it to compare the actual data and predic
Linear Equation
-
withtf.Session()assess:
+
+
withtf.Session()assess:sess.run(init)forepochinrange(no_of_epochs):for(x,y)inzip(abscissa,ordinate):
@@ -218,9 +251,11 @@ values using the X values. We then plot it to compare the actual data and predic
constant=sess.run(b)print(training_cost,coefficient1,constant)
-
Epoch 1000 :TrainingCost:88999125000.0a,b:180396.42-478869.12Epoch 2000 :TrainingCost:88999125000.0a,b:180396.42-478869.12Epoch 3000 :TrainingCost:88999125000.0a,b:180396.42-478869.12Epoch 4000 :TrainingCost:88999125000.0a,b:180396.42-478869.12
@@ -246,9 +281,11 @@ values using the X values. We then plot it to compare the actual data and predic
Epoch 24000 :TrainingCost:88999125000.0a,b:180396.42-478869.12Epoch 25000 :TrainingCost:88999125000.0a,b:180396.42-478869.1288999125000.0 180396.42 -478869.12
-
+
+
-
predictions=[]
+
+
predictions=[]forxinabscissa:predictions.append((coefficient1*x+constant))plt.plot(abscissa,ordinate,'ro',label='Original data')
@@ -256,13 +293,15 @@ values using the X values. We then plot it to compare the actual data and predic
plt.title('Linear Regression Result')plt.legend()plt.show()
-
+
+
Quadratic Equation
-
withtf.Session()assess:
+
+
withtf.Session()assess:sess.run(init)forepochinrange(no_of_epochs):for(x,y)inzip(abscissa,ordinate):
@@ -277,9 +316,11 @@ values using the X values. We then plot it to compare the actual data and predic
constant=sess.run(c)print(training_cost,coefficient1,coefficient2,constant)
-
Epoch 1000 :TrainingCost:52571360000.0a,b,c:1002.44561097.01971276.6921Epoch 2000 :TrainingCost:37798890000.0a,b,c:1952.42632130.28252469.7756Epoch 3000 :TrainingCost:26751185000.0a,b,c:2839.58253081.61183554.351Epoch 4000 :TrainingCost:19020106000.0a,b,c:3644.563922.95634486.3135
@@ -305,9 +346,11 @@ values using the X values. We then plot it to compare the actual data and predic
Epoch 24000 :TrainingCost:8088001000.0a,b,c:6632.963399.878-79.89219Epoch 25000 :TrainingCost:8058094600.0a,b,c:6659.7933227.2517-463.031568058094600.0 6659.793 3227.2517 -463.03156
-
+
+
-
predictions=[]
+
+
predictions=[]forxinabscissa:predictions.append((coefficient1*pow(x,2)+coefficient2*x+constant))plt.plot(abscissa,ordinate,'ro',label='Original data')
@@ -315,13 +358,15 @@ values using the X values. We then plot it to compare the actual data and predic
plt.title('Quadratic Regression Result')plt.legend()plt.show()
-
+
+
Cubic
-
withtf.Session()assess:
+
+
withtf.Session()assess:sess.run(init)forepochinrange(no_of_epochs):for(x,y)inzip(abscissa,ordinate):
@@ -337,9 +382,11 @@ values using the X values. We then plot it to compare the actual data and predic
constant=sess.run(d)print(training_cost,coefficient1,coefficient2,coefficient3,constant)
-
Epoch 1000 :TrainingCost:4279814000.0a,b,c,d:670.1527694.4212751.4653903.9527Epoch 2000 :TrainingCost:3770950400.0a,b,c,d:742.6414666.3489636.94525859.2088Epoch 3000 :TrainingCost:3717708300.0a,b,c,d:756.2582569.3339448.105748.23956Epoch 4000 :TrainingCost:3667464000.0a,b,c,d:769.4476474.0318265.5761654.75525
@@ -365,9 +412,11 @@ values using the X values. We then plot it to compare the actual data and predic
Epoch 24000 :TrainingCost:3070361300.0a,b,c,d:975.52875-1095.4292-2211.8541847.4485Epoch 25000 :TrainingCost:3052791300.0a,b,c,d:983.4346-1159.7922-2286.94122027.48573052791300.0 983.4346 -1159.7922 -2286.9412 2027.4857
-
+
+
-
predictions=[]
+
+
predictions=[]forxinabscissa:predictions.append((coefficient1*pow(x,3)+coefficient2*pow(x,2)+coefficient3*x+constant))plt.plot(abscissa,ordinate,'ro',label='Original data')
@@ -375,13 +424,15 @@ values using the X values. We then plot it to compare the actual data and predic
plt.title('Cubic Regression Result')plt.legend()plt.show()
-
+
+
Quartic
-
withtf.Session()assess:
+
+
withtf.Session()assess:sess.run(init)forepochinrange(no_of_epochs):for(x,y)inzip(abscissa,ordinate):
@@ -398,9 +449,11 @@ values using the X values. We then plot it to compare the actual data and predic
constant=sess.run(e)print(training_cost,coefficient1,coefficient2,coefficient3,coefficient4,constant)
-
Epoch 1000 :TrainingCost:1902632600.0a,b,c,d:84.4830452.21059454.791424142.51952512.0343Epoch 2000 :TrainingCost:1854316200.0a,b,c,d:88.99895513.07355714.276088223.556671056.4655Epoch 3000 :TrainingCost:1812812400.0a,b,c,d:92.9462-22.331177-15.262934327.418581634.9054Epoch 4000 :TrainingCost:1775716000.0a,b,c,d:96.42522-54.64535-35.829437449.50282239.1392
@@ -426,9 +479,11 @@ values using the X values. We then plot it to compare the actual data and predic
Epoch 24000 :TrainingCost:1252052600.0a,b,c,d:135.9583-493.3825490.2686163764.007815010.481Epoch 25000 :TrainingCost:1231713700.0a,b,c,d:137.54753-512.1876101.593723926.489715609.3681231713700.0 137.54753 -512.1876 101.59372 3926.4897 15609.368
-
+
+
-
predictions=[]
+
+
predictions=[]forxinabscissa:predictions.append((coefficient1*pow(x,4)+coefficient2*pow(x,3)+coefficient3*pow(x,2)+coefficient4*x+constant))plt.plot(abscissa,ordinate,'ro',label='Original data')
@@ -436,13 +491,15 @@ values using the X values. We then plot it to compare the actual data and predic
plt.title('Quartic Regression Result')plt.legend()plt.show()
-
+
+
Quintic
-
withtf.Session()assess:
+
+
withtf.Session()assess:sess.run(init)forepochinrange(no_of_epochs):for(x,y)inzip(abscissa,ordinate):
@@ -458,9 +515,11 @@ values using the X values. We then plot it to compare the actual data and predic
coefficient4=sess.run(d)coefficient5=sess.run(e)constant=sess.run(f)
-
Epoch 1000 :TrainingCost:1409200100.0a,b,c,d,e,f:7.9494727.4621955.626034184.29028484.002231024.0083Epoch 2000 :TrainingCost:1306882400.0a,b,c,d,e,f:8.732181-4.008589773.25298315.90103904.088872004.9749Epoch 3000 :TrainingCost:1212606000.0a,b,c,d,e,f:9.732249-16.9012586.28379437.065521305.0552966.2188Epoch 4000 :TrainingCost:1123640400.0a,b,c,d,e,f:10.74851-29.8269298.59997555.3311698.46313917.9155
@@ -486,9 +545,11 @@ values using the X values. We then plot it to compare the actual data and predic
Epoch 24000 :TrainingCost:229660080.0a,b,c,d,e,f:27.102589-238.44817309.353422420.41857770.572819536.19Epoch 25000 :TrainingCost:216972400.0a,b,c,d,e,f:27.660324-245.69016318.100622483.36087957.35420027.707216972400.0 27.660324 -245.69016 318.10062 2483.3608 7957.354 20027.707
-
+
+
-
predictions=[]
+
+
predictions=[]forxinabscissa:predictions.append((coefficient1*pow(x,5)+coefficient2*pow(x,4)+coefficient3*pow(x,3)+coefficient4*pow(x,2)+coefficient5*x+constant))plt.plot(abscissa,ordinate,'ro',label='Original data')
@@ -496,7 +557,8 @@ values using the X values. We then plot it to compare the actual data and predic
plt.title('Quintic Regression Result')
plt.legend()
plt.show()
diff --git a/docs/posts/2019-12-22-Fake-News-Detector.html b/docs/posts/2019-12-22-Fake-News-Detector.html
index 46297b0..9b62b00 100644
--- a/docs/posts/2019-12-22-Fake-News-Detector.html
+++ b/docs/posts/2019-12-22-Fake-News-Detector.html
@@ -60,48 +60,63 @@ Whenever you are looking for a dataset, always try searching on Kaggle and GitHu
This allows you to train the model on the GPU. Turi Create is built on top of Apache's MXNet framework; to use the GPU, we need to install
a CUDA-compatible MXNet package.
+-----------+----------+-----------+--------------+-------------------+---------------------+
| Iteration | Passes   | Step size | Elapsed Time | Training Accuracy | Validation Accuracy |
+-----------+----------+-----------+--------------+-------------------+---------------------+
| 0         | 2        | 1.000000  | 1.156349     | 0.889680          | 0.790036            |
@@ -111,39 +126,50 @@ a CUDA compatible MXNet package.
| 4         | 8        | 1.000000  | 1.814194     | 0.999063          | 0.925267            |
| 9         | 14       | 1.000000  | 2.507072     | 1.000000          | 0.911032            |
+-----------+----------+-----------+--------------+-------------------+---------------------+
Testing the Model
test_predictions = model.predict(test)
accuracy = tc.evaluation.accuracy(test['label'], test_predictions)
print(f'Topic classifier model has a testing accuracy of {accuracy*100}%', flush=True)
We have just created our own Fake News Detection Model which has an accuracy of 92%!
example_text = {"title": ["Middling ‘Rise Of Skywalker’ Review Leaves Fan On Fence About Whether To Threaten To Kill Critic"], "text": ["Expressing ambivalence toward the relatively balanced appraisal of the film, Star Wars fan Miles Ariely admitted Thursday that an online publication’s middling review of The Rise Of Skywalker had left him on the fence about whether he would still threaten to kill the critic who wrote it. “I’m really of two minds about this, because on the one hand, he said the new movie fails to live up to the original trilogy, which makes me at least want to throw a brick through his window with a note telling him to watch his back,” said Ariely, confirming he had already drafted an eight-page-long death threat to Stan Corimer of the website Screen-On Time, but had not yet decided whether to post it to the reviewer’s Facebook page. “On the other hand, though, he commended J.J. Abrams’ skillful pacing and faithfulness to George Lucas’ vision, which makes me wonder if I should just call the whole thing off. Now, I really don’t feel like camping outside his house for hours. Maybe I could go with a response that’s somewhere in between, like, threatening to kill his dog but not everyone in his whole family? I don’t know. This is a tough one.” At press time, sources reported that Ariely had resolved to wear his Ewok costume while he murdered the critic in his sleep."]}
example_prediction = model.classify(tc.SFrame(example_text))
print(example_prediction, flush=True)
Note: To download files from Google Colab, simply click on the Files section in the sidebar, right-click on the filename, and then click Download.
@@ -162,7 +188,8 @@ DescriptionThe bag-of-words model is a simplifying representation used in NLP, i
We define our bag of words function
func bow(text: String) -> [String: Double] {
    var bagOfWords = [String: Double]()
    let tagger = NSLinguisticTagger(tagSchemes: [.tokenType], options: 0)
@@ -181,22 +208,26 @@ DescriptionThe bag-of-words model is a simplifying representation used in NLP, i
    return bagOfWords
}
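As a language-agnostic cross-check, the same bag-of-words idea can be sketched in Python (illustrative only; the app itself uses the Swift function above):

```python
from collections import Counter

def bag_of_words(text: str) -> dict:
    # Lowercase the text, keep alphanumeric word tokens, and count occurrences.
    words = [w for w in text.lower().split() if w.isalnum()]
    return dict(Counter(words))
```

This counts raw token frequencies, which is what the Swift version does in its tagger loop, minus NSLinguisticTagger's smarter tokenisation.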
Finally, we implement a simple function which reads the two text fields, creates their bag of words representation and displays an alert with the appropriate result
Complete Code
import SwiftUI

struct ContentView: View {
    @State private var title: String = ""
@@ -271,7 +302,8 @@ DescriptionThe bag-of-words model is a simplifying representation used in NLP, i
        ContentView()
    }
}
+-------------------------+------------------------+
| path                    | image                  |
+-------------------------+------------------------+
| ./train/default/1.jpg   | Height: 224 Width: 224 |
@@ -194,11 +213,13 @@
[2028 rows x 3 columns]
Note: Only the head of the SFrame is printed.
You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.
Making the Model
import turicreate as tc

# Load the data
data = tc.SFrame('fire-smoke.sframe')
@@ -221,11 +242,13 @@
# Export for use in Core ML
model.export_coreml('fire-smoke.mlmodel')
Performing feature extraction on resized images...
Create a new file called index.html in your project folder. This is the basic template we are going to use. Replace "me" with the root filename of your image; for example, NeverGonnaGiveYouUp.png becomes NeverGonnaGiveYouUp. Make sure you have copied all three files from the output folder in the previous step to the root of your project folder.
Here, we create an A-Frame scene and tell it that we want to use NFT tracking. The amazing part about using A-Frame is that we can use all A-Frame objects!
After you accept that you are okay with your IP address being logged, it will prompt you to update your DNS record. You need to create a new TXT record in the DNS settings for your domain.
@@ -66,7 +70,8 @@
You can check if the TXT records have been updated by using the dig command:
dig navanspi.duckdns.org TXT
; <<>> DiG 9.16.1-Ubuntu <<>> navanspi.duckdns.org TXT
;; global options: +cmd
;; Got answer:
@@ -85,7 +90,8 @@ navanspi.duckdns.org. 60 IN TXT
;; SERVER: 127.0.0.53#53(127.0.0.53)
;; WHEN: Tue Nov 17 15:23:15 IST 2020
;; MSG SIZE  rcvd: 105
DuckDNS propagates the changes almost instantly, but for other domain hosts it could take a while.
@@ -99,13 +105,17 @@ navanspi.duckdns.org. 60 IN TXT
To use the certificate with it, simply copy the cert.pem and privkey.pem to your working directory (change the appropriate permissions) and include them in the command.
If you want to directly open the HTML file in your browser after saving, don't forget to set CORS_PROXY=""
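As an illustrative sketch (assuming Python's built-in http.server rather than the specific command from the post), the copied cert.pem and privkey.pem can be wired into a local HTTPS server like this:

```python
import http.server
import ssl

def make_https_server(certfile="cert.pem", keyfile="privkey.pem", port=8443):
    # Serve the current directory over HTTPS using the certificate files
    # copied into the working directory.
    server = http.server.HTTPServer(("0.0.0.0", port),
                                    http.server.SimpleHTTPRequestHandler)
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    context.load_cert_chain(certfile=certfile, keyfile=keyfile)
    server.socket = context.wrap_socket(server.socket, server_side=True)
    return server

# make_https_server().serve_forever() would start serving on port 8443.
```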
<!doctype html>
<html lang="en">
<head>
    <meta charset="utf-8">
@@ -240,7 +241,8 @@
</script>
<noscript>Uh Oh! Your browser does not support JavaScript or JavaScript is currently disabled. Please enable JavaScript or switch to a different browser.</noscript>
</body>
</html>
diff --git a/docs/posts/2021-06-25-Blog2Twitter-P1.html b/docs/posts/2021-06-25-Blog2Twitter-P1.html
index ada9666..62233ab 100644
--- a/docs/posts/2021-06-25-Blog2Twitter-P1.html
+++ b/docs/posts/2021-06-25-Blog2Twitter-P1.html
@@ -57,7 +57,8 @@ I am not handling lists or images right now.
pip install tweepy
import os
import tweepy

consumer_key = os.environ["consumer_key"]
@@ -70,13 +71,15 @@ I am not handling lists or images right now.
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)
The program needs to convert the blog post into text fragments.
It reads the markdown file, removes the top YAML content, checks for headers and splits the content.
tweets = []
first___n = 0
@@ -103,13 +106,15 @@ I am not handling lists or images right now.
        print("ERROR")
    else:
        tweets.append(line)
Every status update made using tweepy has an ID attached to it; for the next tweet in the thread, the program passes that ID when calling the function.
For every tweet fragment, it also appends 1/n.
for idx, tweet in enumerate(tweets):
    tweet += " {}/{}".format(idx+1, len(tweets))
    if idx == 0:
        a = None
@@ -118,12 +123,15 @@ I am not handling lists or images right now.
        a = api.update_status(tweet, in_reply_to_status_id=a.id)
    print(len(tweet), end=" ")
    print("{}/{}\n".format(idx+1, len(tweets)))
Finally, it replies to the last tweet in the thread with the link to the post.
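That final step can be sketched as follows (reply_with_link and post_url are hypothetical names, since the original snippet is not shown here; `a` holds the last status posted by the loop above):

```python
def reply_with_link(api, last_status_id, post_url):
    # Reply to the last tweet in the thread with the link to the post.
    text = "Read the full post here: {}".format(post_url)
    return api.update_status(text, in_reply_to_status_id=last_status_id)

# e.g. reply_with_link(api, a.id, "https://example.com/posts/my-post.html")
```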
diff --git a/docs/posts/2021-06-27-Crude-ML-AI-Powered-Chatbot-Swift.html b/docs/posts/2021-06-27-Crude-ML-AI-Powered-Chatbot-Swift.html
index 0b307fd..cdae911 100644
--- a/docs/posts/2021-06-27-Crude-ML-AI-Powered-Chatbot-Swift.html
+++ b/docs/posts/2021-06-27-Crude-ML-AI-Powered-Chatbot-Swift.html
@@ -89,7 +89,8 @@ I created a sample JSON with only 3 examples (I know, very less, but works for a
[
    {
        "tokens": ["Tell", "me", "about", "the", "drug", "Aspirin", "."],
        "labels": ["NONE", "NONE", "NONE", "NONE", "NONE", "COMPOUND", "NONE"]
@@ -103,7 +104,8 @@ I created a sample JSON with only 3 examples (I know, very less, but works for a
        "labels": ["NONE", "NONE", "NONE", "NONE", "COMPOUND", "NONE", "NONE"]
    }
]
@@ -113,7 +115,8 @@ I created a sample JSON with only 3 examples (I know, very less, but works for a
import CoreML
import NaturalLanguage

let mlModelClassifier = try IntentDetection_1(configuration: MLModelConfiguration()).model
@@ -124,7 +127,8 @@ I created a sample JSON with only 3 examples (I know, very less, but works for a
let tagger = NLTagger(tagSchemes: [.nameType, NLTagScheme("Apple")])
tagger.setModels([tagPredictor], forTagScheme: NLTagScheme("Apple"))
Now, we define a simple structure which the custom function(s) can use to access the provided input.
It can also be used to hold additional variables.
@@ -134,7 +138,8 @@ The latter can be replaced with a function which asks the user for the input.
struct User {
    static var message = ""
}
@@ -158,14 +163,16 @@ The latter can be replaced with a function which asks the user for the input.
    }
}
Sometimes, no action needs to be performed, and the bot can use a predefined set of responses.
Otherwise, if an action is required, it can call the custom action.
let defaultResponses = [
    "greetings": "Hello",
    "banter": "no, plix no"
]
@@ -173,14 +180,16 @@ Otherwise, if an action is required, it can call the custom action.
let customActions = ["deez-drug": customAction]
In the sample input, the program updates User.message and checks whether there is a default response.
Otherwise, it calls the custom action.
let sampleMessages = [
    "Hey there, how is it going",
    "hello, there",
    "Who let the dogs out",
@@ -200,7 +209,8 @@ Otherwise, it calls the custom action.
        print(customActions[prediction!]!())
    }
}
(Note: I was well within the rate-limit so I did not have to slow down or implement any other measures)
@@ -263,7 +269,8 @@ As of writing this post, I did not include any other database except Trakt.
Installing the Python module (pinecone-client)
import pandas as pd
import pinecone
from sentence_transformers import SentenceTransformer
from tqdm import tqdm
@@ -293,7 +300,8 @@ As of writing this post, I did not include any other database except Trakt.
            str(value), embeddings[idx].tolist()))
index.upsert(to_send)
That's it!
@@ -304,7 +312,8 @@ As of writing this post, I did not include any other database except Trakt.
To find similar items, we will first have to map the name of the movie to its trakt_id, get the embeddings we have for that id and then perform a similarity search.
It is possible that this additional step of mapping could be avoided by storing information as metadata in the index.
def get_trakt_id(df, title: str):
    rec = df[df["title"].str.lower() == title.lower()]
    if len(rec.trakt_id.values.tolist()) > 1:
        print(f"multiple values found... {len(rec.trakt_id.values)}")
@@ -344,11 +353,13 @@ It is possible that this additional step of mapping could be avoided by storing
        "runtime": df.runtime.values[0],
        "year": df.year.values[0]}
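The query step itself is not shown above; a sketch with the classic pinecone-client API might look like this (the fetch/query response shapes are assumptions):

```python
def get_similar_movie_ids(index, trakt_id, top_k=5):
    # Fetch the stored embedding for this trakt_id, then ask the index
    # for the nearest neighbours of that vector.
    fetched = index.fetch(ids=[str(trakt_id)])
    vector = fetched["vectors"][str(trakt_id)]["values"]
    result = index.query(vector=vector, top_k=top_k)
    return [int(match["id"]) for match in result["matches"]]
```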
Testing it Out
movie_name = "Now You See Me"
movie_trakt_id = get_trakt_id(df, movie_name)
print(movie_trakt_id)
@@ -360,7 +371,8 @@ It is possible that this additional step of mapping could be avoided by storing
for trakt_id in movie_ids:
    deets = get_deets_by_trakt_id(df, trakt_id)
    print(f"{deets['title']} ({deets['year']}): {deets['overview']}")
Output:
diff --git a/docs/posts/2022-11-07-a-new-method-to-blog.html b/docs/posts/2022-11-07-a-new-method-to-blog.html
new file mode 100644
index 0000000..aa209b2
--- /dev/null
+++ b/docs/posts/2022-11-07-a-new-method-to-blog.html
@@ -0,0 +1,90 @@
+ Hey - Post - A new method to blog
A new method to blog
Paper Website is a service that lets you build a website with just pen and paper. I am going to try and replicate the process.
The Plan
The Continuity feature on macOS + iOS lets you scan PDFs directly from your iPhone. I want to be able to scan these pages and automatically run an Automator script that takes the PDF and OCRs the text. Then I can further clean the text and convert it from Markdown.
Challenges
I quickly realised that the OCR software I planned on using could not detect my shitty handwriting accurately. I tried using ABBYY FineReader, Prizmo, and OCRmyPDF. (ABBYY FineReader and Prizmo support being automated via Automator.)
Now, I could either write neater, or use an external API like Microsoft Azure.
Solution
OCR
In the PDFs, all the scans are saved as images on a page. I extract the images and then send them to Azure's API.
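A sketch of the request construction for that call — the endpoint path and header names follow Azure's Computer Vision Read REST API, but treat the exact version segment as an assumption:

```python
import urllib.request

def azure_read_request(image_bytes, endpoint, key):
    # Build a POST request for Azure's Read (OCR) endpoint: the raw image
    # bytes go in the body, the subscription key in a header.
    url = endpoint.rstrip("/") + "/vision/v3.2/read/analyze"
    return urllib.request.Request(
        url,
        data=image_bytes,
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/octet-stream",
        },
        method="POST",
    )
```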
Paragraph Breaks
The recognised text had multiple lines breaking in the middle of sentences. Therefore, I use what is called a pilcrow to mark paragraph breaks. But rather than trying to draw a proper pilcrow, I just write the HTML entity ¶, which is the pilcrow character.
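The cleanup can then be sketched as: join the wrapped lines back into one string, and split paragraphs wherever a pilcrow appears:

```python
def rebuild_paragraphs(ocr_text: str) -> str:
    # OCR newlines are mid-sentence wraps; the pilcrow marks real breaks.
    joined = " ".join(line.strip() for line in ocr_text.splitlines())
    paragraphs = [p.strip() for p in joined.split("¶") if p.strip()]
    return "\n\n".join(paragraphs)
```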
Where is the code?
I created a GitHub Gist with a sample Python script that takes the PDF and prints the text.
A more complete version, with Automator scripts and an entire publishing pipeline, will be available as a GitHub and Gitea repo soon.
In Part 2, I will discuss some more features.
\ No newline at end of file
diff --git a/docs/posts/index.html b/docs/posts/index.html
index 1698150..f4fab83 100644
--- a/docs/posts/index.html
+++ b/docs/posts/index.html
@@ -50,6 +50,21 @@