From f5faa2ef095f035110f83e17da0b35d3a34d6b97 Mon Sep 17 00:00:00 2001
From: Navan Chauhan
Date: Sat, 17 Feb 2024 19:52:53 -0700
Subject: bump
---
docs/posts/2022-11-07-a-new-method-to-blog.html | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
(limited to 'docs/posts/2022-11-07-a-new-method-to-blog.html')

diff --git a/docs/posts/2022-11-07-a-new-method-to-blog.html b/docs/posts/2022-11-07-a-new-method-to-blog.html
index 3eb3b7e..427acd9 100644
--- a/docs/posts/2022-11-07-a-new-method-to-blog.html
+++ b/docs/posts/2022-11-07-a-new-method-to-blog.html
@@ -26,7 +26,7 @@
- +
@@ -35,6 +35,7 @@
home | about/links | posts | +3D designs | RSS Feed |
-- cgit v1.2.3

From f6d2141a480dd6b5b8ee0e48d43bb64773232791 Mon Sep 17 00:00:00 2001
From: Navan Chauhan
Date: Tue, 26 Mar 2024 23:38:14 -0600
Subject: add header ids
---
docs/posts/2022-11-07-a-new-method-to-blog.html | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)
(limited to 'docs/posts/2022-11-07-a-new-method-to-blog.html')

diff --git a/docs/posts/2022-11-07-a-new-method-to-blog.html b/docs/posts/2022-11-07-a-new-method-to-blog.html
index 427acd9..cbf80ec 100644
--- a/docs/posts/2022-11-07-a-new-method-to-blog.html
+++ b/docs/posts/2022-11-07-a-new-method-to-blog.html
@@ -6,13 +6,13 @@
- A new method to blog + id="a-new-method-to-blog">A new method to blog - - + A new method to blog" /> + A new method to blog" />
@@ -44,33 +44,33 @@
-

A new method to blog

+

A new method to blog

Here is the original PDF. I made some edits to the content after generating the Markdown file.

Paper Website is a service that lets you build a website with just pen and paper. I am going to try and replicate the process.

-

The Plan

+

The Plan

The Continuity feature on macOS + iOS lets you scan PDFs directly from your iPhone. I want to be able to scan these pages and automatically run an Automator script that takes the PDF and OCRs the text. Then I can further clean the text and convert it from Markdown.

-

Challenges

+

Challenges

I quickly realised that the OCR software I planned on using could not detect my shitty handwriting accurately. I tried using ABBYY FineReader, Prizmo and OCRmyPDF. (ABBYY FineReader and Prizmo support being automated by Automator.)

Now, I could either write more neatly or use an external API like Microsoft Azure.

-

Solution

+

Solution

-

OCR

+

OCR

In the PDFs, all the scans are saved as images on a page. I extract the image and then send it to Azure's API.
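
The post does not say which Azure endpoint is used or what the response looks like. As a rough sketch, assuming the response has the JSON shape returned by Azure's Read API (v3.2), with a status field and text lines nested under analyzeResult.readResults, the recognised lines could be collected like this. The function name is hypothetical and is not taken from the author's script.

```python
def lines_from_read_result(result):
    """Collect recognised text lines, page by page, from a Read-style result.

    Assumption: `result` is the parsed JSON of a finished Read operation,
    shaped like {"status": ..., "analyzeResult": {"readResults": [...]}}.
    """
    if result.get("status") != "succeeded":
        # Still running or failed: nothing to extract yet.
        return []
    pages = result.get("analyzeResult", {}).get("readResults", [])
    return [line["text"] for page in pages for line in page.get("lines", [])]
```

Each returned string is one line as it appeared on the scanned page, so a sentence can still be split across several entries.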

-

Paragraph Breaks

+

Paragraph Breaks

The recognised text had multiple lines breaking in the middle of a sentence. Therefore, I use what is called a pilcrow to specify paragraph breaks. But rather than trying to draw the normal pilcrow, I just write the HTML entity &para;, which renders as the pilcrow character (¶).
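
The pilcrow convention can be sketched in a few lines of Python. This is a minimal illustration, not the author's script: it assumes the OCR step yields a list of text lines and that the paragraph marker comes back as the literal ¶ character. The function name and the marker parameter are hypothetical.

```python
PILCROW = "\u00b6"  # the pilcrow character


def split_paragraphs(ocr_lines, marker=PILCROW):
    """Rejoin OCR lines into running text, then split paragraphs on the marker."""
    # Line breaks from the scan are not meaningful, so join everything first.
    text = " ".join(line.strip() for line in ocr_lines if line.strip())
    # Each chunk between markers is one paragraph; collapse stray whitespace.
    paragraphs = [" ".join(chunk.split()) for chunk in text.split(marker)]
    # Drop empty chunks, e.g. from a marker at the very start or end.
    return [p for p in paragraphs if p]
```

If the handwritten marker is recognised as the six characters of the entity instead, passing marker="&para;" handles that case the same way.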

-

Where is the code?

+

Where is the code?

I created a GitHub Gist with a sample Python script that takes the PDF and prints the text.

-- cgit v1.2.3

From 9e620084e57378952c1a7f8e0a772ebebd18932b Mon Sep 17 00:00:00 2001
From: Navan Chauhan
Date: Wed, 27 Mar 2024 20:35:09 -0600
Subject: quick fix
---
docs/posts/2022-11-07-a-new-method-to-blog.html | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
(limited to 'docs/posts/2022-11-07-a-new-method-to-blog.html')

diff --git a/docs/posts/2022-11-07-a-new-method-to-blog.html b/docs/posts/2022-11-07-a-new-method-to-blog.html
index cbf80ec..7f30c72 100644
--- a/docs/posts/2022-11-07-a-new-method-to-blog.html
+++ b/docs/posts/2022-11-07-a-new-method-to-blog.html
@@ -6,13 +6,13 @@
- id="a-new-method-to-blog">A new method to blog + A new method to blog - A new method to blog" /> - A new method to blog" /> + +
-- cgit v1.2.3

From 01ff93c9c16867216f2d249664803860e1d6d5eb Mon Sep 17 00:00:00 2001
From: Navan Chauhan
Date: Wed, 27 Mar 2024 22:49:40 -0600
Subject: generate new theme
---
docs/posts/2022-11-07-a-new-method-to-blog.html | 55 +++++++++++++++++--------
1 file changed, 37 insertions(+), 18 deletions(-)
(limited to 'docs/posts/2022-11-07-a-new-method-to-blog.html')

diff --git a/docs/posts/2022-11-07-a-new-method-to-blog.html b/docs/posts/2022-11-07-a-new-method-to-blog.html
index 7f30c72..9f4ce15 100644
--- a/docs/posts/2022-11-07-a-new-method-to-blog.html
+++ b/docs/posts/2022-11-07-a-new-method-to-blog.html
@@ -2,14 +2,26 @@
- + + + + + A new method to blog + + + + + + + - A new method to blog -
@@ -29,21 +41,27 @@
- -
-
+ + +
-
- +

A new method to blog

Here is the original PDF. I made some edits to the content after generating the Markdown file.

@@ -78,14 +96,15 @@

* In Part 2, I will discuss some more features *

+
If you have scrolled this far, consider subscribing to my mailing list here. You can subscribe to either a specific type of post you are interested in, or subscribe to everything with the "Everything" list.
-
+
-- cgit v1.2.3

From de19543d7fb44d343b052dc9b34ede78620c4a46 Mon Sep 17 00:00:00 2001
From: Navan Chauhan
Date: Wed, 27 Mar 2024 23:36:55 -0600
Subject: Generate
---
docs/posts/2022-11-07-a-new-method-to-blog.html | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)
(limited to 'docs/posts/2022-11-07-a-new-method-to-blog.html')

diff --git a/docs/posts/2022-11-07-a-new-method-to-blog.html b/docs/posts/2022-11-07-a-new-method-to-blog.html
index 9f4ce15..36aa737 100644
--- a/docs/posts/2022-11-07-a-new-method-to-blog.html
+++ b/docs/posts/2022-11-07-a-new-method-to-blog.html
@@ -46,7 +46,7 @@
+ +
-- cgit v1.2.3

From a982ceab0b45609991179b3020a00260eed6f798 Mon Sep 17 00:00:00 2001
From: Navan Chauhan
Date: Wed, 27 Mar 2024 23:45:59 -0600
Subject: css
---
docs/posts/2022-11-07-a-new-method-to-blog.html | 1 +
1 file changed, 1 insertion(+)
(limited to 'docs/posts/2022-11-07-a-new-method-to-blog.html')

diff --git a/docs/posts/2022-11-07-a-new-method-to-blog.html b/docs/posts/2022-11-07-a-new-method-to-blog.html
index 36aa737..1f477bd 100644
--- a/docs/posts/2022-11-07-a-new-method-to-blog.html
+++ b/docs/posts/2022-11-07-a-new-method-to-blog.html
@@ -5,6 +5,7 @@
+ A new method to blog
-- cgit v1.2.3