Diffstat (limited to 'docs/posts/2022-11-07-a-new-method-to-blog.html')
-rw-r--r-- | docs/posts/2022-11-07-a-new-method-to-blog.html | 83 |
1 files changed, 58 insertions, 25 deletions
diff --git a/docs/posts/2022-11-07-a-new-method-to-blog.html b/docs/posts/2022-11-07-a-new-method-to-blog.html
index 3eb3b7e..1f477bd 100644
--- a/docs/posts/2022-11-07-a-new-method-to-blog.html
+++ b/docs/posts/2022-11-07-a-new-method-to-blog.html
@@ -2,14 +2,27 @@
 <html lang="en">
 <head>
- <link rel="stylesheet" href="https://unpkg.com/latex.css/style.min.css" />
+ <meta http-equiv="X-UA-Compatible" content="IE=edge">
+ <meta http-equiv="content-type" content="text/html; charset=utf-8">
+ <meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1">
+ <meta name="theme-color" content="#6a9fb5">
+
+ <title>A new method to blog</title>
+
+ <!--
+ <link rel="stylesheet" href="https://unpkg.com/latex.css/style.min.css" />
+ -->
+
+ <link rel="stylesheet" href="/assets/c-hyde.css" />
+
+ <link rel="stylesheet" href="http://fonts.googleapis.com/css?family=PT+Sans:400,400italic,700|Abril+Fatface">
+
 <link rel="stylesheet" href="/assets/main.css" />
 <meta charset="utf-8">
 <meta name="viewport" content="width=device-width, initial-scale=1.0">
- <title>A new method to blog</title>
 <meta name="og:site_name" content="Navan Chauhan" />
 <link rel="canonical" href="https://web.navan.dev/posts/2022-11-07-a-new-method-to-blog.html" />
- <meta name="twitter:url" content="https://web.navan.dev/posts/2022-11-07-a-new-method-to-blog.html />
+ <meta name="twitter:url" content="https://web.navan.dev/posts/2022-11-07-a-new-method-to-blog.html" />
 <meta name="og:url" content="https://web.navan.dev/posts/2022-11-07-a-new-method-to-blog.html" />
 <meta name="twitter:title" content="A new method to blog" />
 <meta name="og:title" content="A new method to blog" />
@@ -26,50 +39,69 @@
 <script data-goatcounter="https://navanchauhan.goatcounter.com/count" async src="//gc.zgo.at/count.js"></script>
 <script defer data-domain="web.navan.dev" src="https://plausible.io/js/plausible.js"></script>
- <link rel="manifest" href="manifest.json" />
+ <link rel="manifest" href="/manifest.json" />
 </head>
-<body>
- <center><nav style="display: block;">
-|
-<a href="/">home</a> |
-<a href="/about/">about/links</a> |
-<a href="/posts/">posts</a> |
-<!--<a href="/publications/">publications</a> |-->
-<!--<a href="/repo/">iOS repo</a> |-->
-<a href="/feed.rss">RSS Feed</a> |
-</nav>
-</center>
+<body class="theme-base-0d">
+ <div class="sidebar">
+ <div class="container sidebar-sticky">
+ <div class="sidebar-about">
+ <h1><a href="/">Navan</a></h1>
+ <p class="lead" id="random-lead">Alea iacta est.</p>
+ </div>
+
+ <ul class="sidebar-nav">
+ <li><a class="sidebar-nav-item" href="/about/">about/links</a></li>
+ <li><a class="sidebar-nav-item" href="/posts/">posts</a></li>
+ <li><a class="sidebar-nav-item" href="/3D-Designs/">3D designs</a></li>
+ <li><a class="sidebar-nav-item" href="/feed.rss">RSS Feed</a></li>
+ <li><a class="sidebar-nav-item" href="/colophon/">colophon</a></li>
+ </ul>
+ <div class="copyright"><p>© 2019-2024. Navan Chauhan <br> <a href="/feed.rss">RSS</a></p></div>
+ </div>
+</div>
+
+<script>
+let phrases = [
+ "Something Funny", "Veni, vidi, vici", "Alea iacta est", "In vino veritas", "Acta, non verba", "Castigat ridendo mores",
+ "Cui bono?", "Memento vivere", "अहम् ब्रह्मास्मि", "अनुगच्छतु प्रवाहं", "चरन्मार्गान्विजानाति", "coq de cheval", "我愛啤酒"
+ ];
+
+let new_phrase = phrases[Math.floor(Math.random()*phrases.length)];
+
+let lead = document.getElementById("random-lead");
+lead.innerText = new_phrase;
+</script>
+
 <div class="content container">
-<main>
-
- <h1>A new method to blog</h1>
+ <div class="post">
+ <h1 id="a-new-method-to-blog">A new method to blog</h1>
 <p><em><a rel="noopener" target="_blank" href="/assets/pdfs/2022-11-07-a-new-way-to-blog.pdf">Here</a> is the original PDF. I made some edits to the content after generating the markdown file</em></p>
 <p><a rel="noopener" target="_blank" href="https://paperwebsite.com">Paper Website</a> is a service that lets you build a website with just pen and paper. I am going to try and replicate the process.</p>
-<h2>The Plan</h2>
+<h2 id="the-plan">The Plan</h2>
 <p>The continuity feature on macOS + iOS lets you scan PDFs directly from your iPhone. I want to be able to scan these pages and automatically run an Automator script that takes the PDF and OCRs the text. Then I can further clean the text and convert from markdown.</p>
-<h2>Challenges</h2>
+<h2 id="challenges">Challenges</h2>
 <p>I quickly realised that the OCR software I planned on using could not detect my shitty handwriting accurately. I tried using ABBY Finereader, Prizmo and OCRMyPDF. (Abby Finereader and Prizmo support being automated by Automator).</p>
 <p>Now, I could either write neater, or use an external API like Microsoft Azure</p>
-<h2>Solution</h2>
+<h2 id="solution">Solution</h2>
-<h3>OCR</h3>
+<h3 id="ocr">OCR</h3>
 <p>In the PDFs, all the scans are saved as images on a page. I extract the image and then send it to Azure's API. </p>
-<h3>Paragraph Breaks</h3>
+<h3 id="paragraph-breaks">Paragraph Breaks</h3>
 <p>The recognised text had multiple lines breaking in the middle of the sentence, Therefore, I use what is called a <a rel="noopener" target="_blank" href="https://en.wikipedia.org/wiki/Pilcrow">pilcrow</a> to specify paragraph breaks. But, rather than trying to draw the normal pilcrow, I just use the HTML entity <code>&#182;</code> which is the pilcrow character. </p>
-<h2>Where is the code?</h2>
+<h2 id="where-is-the-code">Where is the code?</h2>
 <p>I created a <a rel="noopener" target="_blank" href="https://gist.github.com/navanchauhan/5fc602b1e023b60a66bc63bd4eecd4f8">GitHub Gist</a> for a sample Python script to take the PDF and print the text </p>
@@ -77,14 +109,15 @@
 <p><em>* In Part 2, I will discuss some more features *</em> </p>
+ </div>
 <blockquote>If you have scrolled this far, consider subscribing to my mailing list <a href="https://listmonk.navan.dev/subscription/form">here.</a> You can subscribe to either a specific type of post you are interested in, or subscribe to everything with the "Everything" list.</blockquote>
 <script data-isso="https://comments.navan.dev/" src="https://comments.navan.dev/js/embed.min.js"></script>
 <section id="isso-thread">
 <noscript>Javascript needs to be activated to view comments.</noscript>
 </section>
-</main>
+ </div>
 <script src="assets/manup.min.js"></script>
 <script src="/pwabuilder-sw-register.js"></script>
 </body>
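
Note on the "Paragraph Breaks" step described in the post: the idea of joining OCR lines that break mid-sentence and splitting paragraphs at the pilcrow can be sketched roughly as below. This is only an illustration, not the script from the linked Gist. The function name `split_paragraphs` and the assumption that the OCR step returns a list of line strings are hypothetical.

```python
import re

# The character produced by the HTML entity &#182; (pilcrow).
PILCROW = "\u00b6"


def split_paragraphs(ocr_lines):
    """Join OCR'd lines and split the result into paragraphs at pilcrows.

    `ocr_lines` is assumed (hypothetically) to be a list of strings,
    one per physical line recognised by the OCR service.
    """
    # OCR output breaks in the middle of sentences, so join the
    # non-empty lines with single spaces to undo those breaks.
    text = " ".join(line.strip() for line in ocr_lines if line.strip())
    # Every pilcrow marks a paragraph boundary; collapse stray whitespace.
    parts = (re.sub(r"\s+", " ", part).strip() for part in text.split(PILCROW))
    # Drop empty paragraphs, e.g. from a leading or trailing pilcrow.
    return [part for part in parts if part]


if __name__ == "__main__":
    sample = [
        "The recognised text had",
        "multiple line breaks. \u00b6 A new",
        "paragraph starts here.",
    ]
    for paragraph in split_paragraphs(sample):
        print(paragraph)
```

Splitting on an explicit marker keeps the logic independent of how the OCR service wraps lines, which is the motivation the post gives for writing a pilcrow by hand.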