path: root/posts/2020-07-01-Install-rdkit-colab/index.html
blob: 963add14bdf721442c3b92c95e26f6d1c76c5d2a (plain)
<!DOCTYPE html><html lang="en"><head><meta charset="UTF-8"/><meta name="og:site_name" content="Navan Chauhan"/><link rel="canonical" href="https://navanchauhan.github.io/posts/2020-07-01-Install-rdkit-colab"/><meta name="twitter:url" content="https://navanchauhan.github.io/posts/2020-07-01-Install-rdkit-colab"/><meta name="og:url" content="https://navanchauhan.github.io/posts/2020-07-01-Install-rdkit-colab"/><title>Installing RDKit on Google Colab | Navan Chauhan</title><meta name="twitter:title" content="Installing RDKit on Google Colab | Navan Chauhan"/><meta name="og:title" content="Installing RDKit on Google Colab | Navan Chauhan"/><meta name="description" content="Install RDKit on Google Colab with one code snippet."/><meta name="twitter:description" content="Install RDKit on Google Colab with one code snippet."/><meta name="og:description" content="Install RDKit on Google Colab with one code snippet."/><meta name="twitter:card" content="summary"/><link rel="stylesheet" href="/styles.css" type="text/css"/><meta name="viewport" content="width=device-width, initial-scale=1.0"/><link rel="shortcut icon" href="/images/favicon.png" type="image/png"/><link rel="alternate" href="/feed.rss" type="application/rss+xml" title="Subscribe to Navan Chauhan"/><meta name="twitter:image" content="https://navanchauhan.github.io/images/logo.png"/><meta name="og:image" content="https://navanchauhan.github.io/images/logo.png"/></head><head><script async src="//gc.zgo.at/count.js" data-goatcounter="https://navanchauhan.goatcounter.com/count"></script></head><body class="item-page"><header><div class="wrapper"><a class="site-name" href="/">Navan Chauhan</a><nav><ul><li><a href="/about">About Me</a></li><li><a class="selected" href="/posts">Posts</a></li><li><a href="/publications">Publications</a></li><li><a href="/assets/résumé.pdf">Résumé</a></li><li><a href="https://navanchauhan.github.io/repo">Repo</a></li><li><a href="/feed.rss">RSS Feed</a></li></ul></nav></div></header><div 
class="wrapper"><article><div class="content"><span class="reading-time">2 minute read</span><span class="reading-time">Created on July 1, 2020</span><span class="reading-time">Last modified on September 15, 2020</span><h1>Installing RDKit on Google Colab</h1><p>RDKit is one of the most integral parts of any cheminformatics specialist's toolkit, but it is notoriously difficult to install unless you already have <code>conda</code> installed. I originally found this snippet in a GitHub Gist, but I have not been able to find that gist again :/</p><p>Just copy and paste this into a Colab cell and it will install RDKit 👍</p><pre><code><div class="highlight"><span></span><span class="kn">import</span> <span class="nn">sys</span>
<span class="kn">import</span> <span class="nn">os</span>
<span class="kn">import</span> <span class="nn">requests</span>
<span class="kn">import</span> <span class="nn">subprocess</span>
<span class="kn">import</span> <span class="nn">shutil</span>
<span class="kn">from</span> <span class="nn">logging</span> <span class="kn">import</span> <span class="n">getLogger</span><span class="p">,</span> <span class="n">StreamHandler</span><span class="p">,</span> <span class="n">INFO</span>


<span class="n">logger</span> <span class="o">=</span> <span class="n">getLogger</span><span class="p">(</span><span class="vm">__name__</span><span class="p">)</span>
<span class="n">logger</span><span class="o">.</span><span class="n">addHandler</span><span class="p">(</span><span class="n">StreamHandler</span><span class="p">())</span>
<span class="n">logger</span><span class="o">.</span><span class="n">setLevel</span><span class="p">(</span><span class="n">INFO</span><span class="p">)</span>


<span class="k">def</span> <span class="nf">install</span><span class="p">(</span>
        <span class="n">chunk_size</span><span class="o">=</span><span class="mi">4096</span><span class="p">,</span>
        <span class="n">file_name</span><span class="o">=</span><span class="s2">&quot;Miniconda3-latest-Linux-x86_64.sh&quot;</span><span class="p">,</span>
        <span class="n">url_base</span><span class="o">=</span><span class="s2">&quot;https://repo.anaconda.com/miniconda/&quot;</span><span class="p">,</span>
        <span class="n">conda_path</span><span class="o">=</span><span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">expanduser</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="s2">&quot;~&quot;</span><span class="p">,</span> <span class="s2">&quot;miniconda&quot;</span><span class="p">)),</span>
        <span class="n">rdkit_version</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span>
        <span class="n">add_python_path</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span>
        <span class="n">force</span><span class="o">=</span><span class="kc">False</span><span class="p">):</span>
    <span class="sd">&quot;&quot;&quot;install rdkit from miniconda</span>
<span class="sd">    ```</span>
<span class="sd">    import rdkit_installer</span>
<span class="sd">    rdkit_installer.install()</span>
<span class="sd">    ```</span>
<span class="sd">    &quot;&quot;&quot;</span>

    <span class="n">python_path</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">join</span><span class="p">(</span>
        <span class="n">conda_path</span><span class="p">,</span>
        <span class="s2">&quot;lib&quot;</span><span class="p">,</span>
        <span class="s2">&quot;python</span><span class="si">{0}</span><span class="s2">.</span><span class="si">{1}</span><span class="s2">&quot;</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="o">*</span><span class="n">sys</span><span class="o">.</span><span class="n">version_info</span><span class="p">),</span>
        <span class="s2">&quot;site-packages&quot;</span><span class="p">,</span>
    <span class="p">)</span>

    <span class="k">if</span> <span class="n">add_python_path</span> <span class="ow">and</span> <span class="n">python_path</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">sys</span><span class="o">.</span><span class="n">path</span><span class="p">:</span>
        <span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s2">&quot;add </span><span class="si">{}</span><span class="s2"> to PYTHONPATH&quot;</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">python_path</span><span class="p">))</span>
        <span class="n">sys</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">python_path</span><span class="p">)</span>

    <span class="k">if</span> <span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">isdir</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">python_path</span><span class="p">,</span> <span class="s2">&quot;rdkit&quot;</span><span class="p">)):</span>
        <span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s2">&quot;rdkit is already installed&quot;</span><span class="p">)</span>
        <span class="k">if</span> <span class="ow">not</span> <span class="n">force</span><span class="p">:</span>
            <span class="k">return</span>

        <span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s2">&quot;force re-install&quot;</span><span class="p">)</span>

    <span class="n">url</span> <span class="o">=</span> <span class="n">url_base</span> <span class="o">+</span> <span class="n">file_name</span>
    <span class="n">python_version</span> <span class="o">=</span> <span class="s2">&quot;</span><span class="si">{0}</span><span class="s2">.</span><span class="si">{1}</span><span class="s2">.</span><span class="si">{2}</span><span class="s2">&quot;</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="o">*</span><span class="n">sys</span><span class="o">.</span><span class="n">version_info</span><span class="p">)</span>

    <span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s2">&quot;python version: </span><span class="si">{}</span><span class="s2">&quot;</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">python_version</span><span class="p">))</span>

    <span class="k">if</span> <span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">isdir</span><span class="p">(</span><span class="n">conda_path</span><span class="p">):</span>
        <span class="n">logger</span><span class="o">.</span><span class="n">warning</span><span class="p">(</span><span class="s2">&quot;remove current miniconda&quot;</span><span class="p">)</span>
        <span class="n">shutil</span><span class="o">.</span><span class="n">rmtree</span><span class="p">(</span><span class="n">conda_path</span><span class="p">)</span>
    <span class="k">elif</span> <span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">isfile</span><span class="p">(</span><span class="n">conda_path</span><span class="p">):</span>
        <span class="n">logger</span><span class="o">.</span><span class="n">warning</span><span class="p">(</span><span class="s2">&quot;remove </span><span class="si">{}</span><span class="s2">&quot;</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">conda_path</span><span class="p">))</span>
        <span class="n">os</span><span class="o">.</span><span class="n">remove</span><span class="p">(</span><span class="n">conda_path</span><span class="p">)</span>

    <span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">&#39;fetching installer from </span><span class="si">{}</span><span class="s1">&#39;</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">url</span><span class="p">))</span>
    <span class="n">res</span> <span class="o">=</span> <span class="n">requests</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">url</span><span class="p">,</span> <span class="n">stream</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
    <span class="n">res</span><span class="o">.</span><span class="n">raise_for_status</span><span class="p">()</span>
    <span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">file_name</span><span class="p">,</span> <span class="s1">&#39;wb&#39;</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
        <span class="k">for</span> <span class="n">chunk</span> <span class="ow">in</span> <span class="n">res</span><span class="o">.</span><span class="n">iter_content</span><span class="p">(</span><span class="n">chunk_size</span><span class="p">):</span>
            <span class="n">f</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="n">chunk</span><span class="p">)</span>
    <span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">&#39;done&#39;</span><span class="p">)</span>

    <span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">&#39;installing miniconda to </span><span class="si">{}</span><span class="s1">&#39;</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">conda_path</span><span class="p">))</span>
    <span class="n">subprocess</span><span class="o">.</span><span class="n">check_call</span><span class="p">([</span><span class="s2">&quot;bash&quot;</span><span class="p">,</span> <span class="n">file_name</span><span class="p">,</span> <span class="s2">&quot;-b&quot;</span><span class="p">,</span> <span class="s2">&quot;-p&quot;</span><span class="p">,</span> <span class="n">conda_path</span><span class="p">])</span>
    <span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">&#39;done&#39;</span><span class="p">)</span>

    <span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s2">&quot;installing rdkit&quot;</span><span class="p">)</span>
    <span class="n">subprocess</span><span class="o">.</span><span class="n">check_call</span><span class="p">([</span>
        <span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">conda_path</span><span class="p">,</span> <span class="s2">&quot;bin&quot;</span><span class="p">,</span> <span class="s2">&quot;conda&quot;</span><span class="p">),</span>
        <span class="s2">&quot;install&quot;</span><span class="p">,</span>
        <span class="s2">&quot;--yes&quot;</span><span class="p">,</span>
        <span class="s2">&quot;-c&quot;</span><span class="p">,</span> <span class="s2">&quot;rdkit&quot;</span><span class="p">,</span>
        <span class="s2">&quot;python==</span><span class="si">{}</span><span class="s2">&quot;</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">python_version</span><span class="p">),</span>
        <span class="s2">&quot;rdkit&quot;</span> <span class="k">if</span> <span class="n">rdkit_version</span> <span class="ow">is</span> <span class="kc">None</span> <span class="k">else</span> <span class="s2">&quot;rdkit==</span><span class="si">{}</span><span class="s2">&quot;</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">rdkit_version</span><span class="p">)])</span>
    <span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s2">&quot;done&quot;</span><span class="p">)</span>

    <span class="kn">import</span> <span class="nn">rdkit</span>
    <span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s2">&quot;rdkit-</span><span class="si">{}</span><span class="s2"> installation finished!&quot;</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">rdkit</span><span class="o">.</span><span class="n">__version__</span><span class="p">))</span>


<span class="k">if</span> <span class="vm">__name__</span> <span class="o">==</span> <span class="s2">&quot;__main__&quot;</span><span class="p">:</span>
    <span class="n">install</span><span class="p">()</span>
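
# Optional sanity check (not part of the original snippet). A minimal sketch,
# assuming install() above finished and added conda's site-packages to sys.path.
# rdkit_available is a hypothetical helper name used only for illustration.
import importlib
import importlib.util


def rdkit_available():
    """Return True if the rdkit package can be located on sys.path."""
    # Refresh import caches in case site-packages was created after
    # its path was appended to sys.path.
    importlib.invalidate_caches()
    return importlib.util.find_spec("rdkit") is not None


if rdkit_available():
    from rdkit import Chem

    mol = Chem.MolFromSmiles("CCO")  # ethanol
    print("heavy atoms in ethanol:", mol.GetNumAtoms())
else:
    print("rdkit is not importable yet; run install() first")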
</div></code></pre></div><span>Tagged with: </span><ul class="tag-list"><li><a href="/tags/tutorial">Tutorial</a></li><li><a href="/tags/codesnippet">Code-Snippet</a></li><li><a href="/tags/colab">Colab</a></li></ul></article></div><footer><p>Made with ❤️ using <a href="https://github.com/johnsundell/publish">Publish</a></p><p><a href="/feed.rss">RSS feed</a></p></footer></body></html>