1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
|
<!DOCTYPE html><html lang="en"><head><meta charset="UTF-8"/><meta name="og:site_name" content="Navan Chauhan"/><link rel="canonical" href="https://navanchauhan.github.io/posts/2020-07-01-Install-rdkit-colab"/><meta name="twitter:url" content="https://navanchauhan.github.io/posts/2020-07-01-Install-rdkit-colab"/><meta name="og:url" content="https://navanchauhan.github.io/posts/2020-07-01-Install-rdkit-colab"/><title>Installing RDKit on Google Colab | Navan Chauhan</title><meta name="twitter:title" content="Installing RDKit on Google Colab | Navan Chauhan"/><meta name="og:title" content="Installing RDKit on Google Colab | Navan Chauhan"/><meta name="description" content="Install RDKit on Google Colab with one code snippet."/><meta name="twitter:description" content="Install RDKit on Google Colab with one code snippet."/><meta name="og:description" content="Install RDKit on Google Colab with one code snippet."/><meta name="twitter:card" content="summary"/><link rel="stylesheet" href="/styles.css" type="text/css"/><meta name="viewport" content="width=device-width, initial-scale=1.0"/><link rel="shortcut icon" href="/images/favicon.png" type="image/png"/><link rel="alternate" href="/feed.rss" type="application/rss+xml" title="Subscribe to Navan Chauhan"/><meta name="twitter:image" content="https://navanchauhan.github.io/images/logo.png"/><meta name="og:image" content="https://navanchauhan.github.io/images/logo.png"/></head><head><script async src="//gc.zgo.at/count.js" data-goatcounter="https://navanchauhan.goatcounter.com/count"></script></head><body class="item-page"><header><div class="wrapper"><a class="site-name" href="/">Navan Chauhan</a><nav><ul><li><a href="/about">About Me</a></li><li><a class="selected" href="/posts">Posts</a></li><li><a href="/publications">Publications</a></li><li><a href="/assets/résumé.pdf">Résumé</a></li><li><a href="https://navanchauhan.github.io/repo">Repo</a></li><li><a href="/feed.rss">RSS</a></li></ul></nav></div></header><div class="wrapper"><article><div class="content"><span class="reading-time">2 minute read</span><span class="reading-time">Created on July 1, 2020</span><span class="reading-time">Last modified on September 15, 2020</span><h1>Installing RDKit on Google Colab</h1><p>RDKit is one of the most integral part of any Cheminfomatic specialist's toolkit but it is notoriously difficult to install unless you already have <code>conda</code> installed. I originally found this in a GitHub Gist but I have not been able to find that gist again :/</p><p>Just copy and paste this in a Colab cell and it will install it 👍</p><pre><code><div class="highlight"><span></span><span class="kn">import</span> <span class="nn">sys</span>
<span class="kn">import</span> <span class="nn">os</span>
<span class="kn">import</span> <span class="nn">requests</span>
<span class="kn">import</span> <span class="nn">subprocess</span>
<span class="kn">import</span> <span class="nn">shutil</span>
<span class="kn">from</span> <span class="nn">logging</span> <span class="kn">import</span> <span class="n">getLogger</span><span class="p">,</span> <span class="n">StreamHandler</span><span class="p">,</span> <span class="n">INFO</span>
<span class="n">logger</span> <span class="o">=</span> <span class="n">getLogger</span><span class="p">(</span><span class="vm">__name__</span><span class="p">)</span>
<span class="n">logger</span><span class="o">.</span><span class="n">addHandler</span><span class="p">(</span><span class="n">StreamHandler</span><span class="p">())</span>
<span class="n">logger</span><span class="o">.</span><span class="n">setLevel</span><span class="p">(</span><span class="n">INFO</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">install</span><span class="p">(</span>
<span class="n">chunk_size</span><span class="o">=</span><span class="mi">4096</span><span class="p">,</span>
<span class="n">file_name</span><span class="o">=</span><span class="s2">"Miniconda3-latest-Linux-x86_64.sh"</span><span class="p">,</span>
<span class="n">url_base</span><span class="o">=</span><span class="s2">"https://repo.continuum.io/miniconda/"</span><span class="p">,</span>
<span class="n">conda_path</span><span class="o">=</span><span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">expanduser</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="s2">"~"</span><span class="p">,</span> <span class="s2">"miniconda"</span><span class="p">)),</span>
<span class="n">rdkit_version</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span>
<span class="n">add_python_path</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span>
<span class="n">force</span><span class="o">=</span><span class="kc">False</span><span class="p">):</span>
<span class="sd">"""install rdkit from miniconda</span>
<span class="sd"> ```</span>
<span class="sd"> import rdkit_installer</span>
<span class="sd"> rdkit_installer.install()</span>
<span class="sd"> ```</span>
<span class="sd"> """</span>
<span class="n">python_path</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">join</span><span class="p">(</span>
<span class="n">conda_path</span><span class="p">,</span>
<span class="s2">"lib"</span><span class="p">,</span>
<span class="s2">"python</span><span class="si">{0}</span><span class="s2">.</span><span class="si">{1}</span><span class="s2">"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="o">*</span><span class="n">sys</span><span class="o">.</span><span class="n">version_info</span><span class="p">),</span>
<span class="s2">"site-packages"</span><span class="p">,</span>
<span class="p">)</span>
<span class="k">if</span> <span class="n">add_python_path</span> <span class="ow">and</span> <span class="n">python_path</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">sys</span><span class="o">.</span><span class="n">path</span><span class="p">:</span>
<span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s2">"add </span><span class="si">{}</span><span class="s2"> to PYTHONPATH"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">python_path</span><span class="p">))</span>
<span class="n">sys</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">python_path</span><span class="p">)</span>
<span class="k">if</span> <span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">isdir</span><span class="p">(</span><span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">python_path</span><span class="p">,</span> <span class="s2">"rdkit"</span><span class="p">)):</span>
<span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s2">"rdkit is already installed"</span><span class="p">)</span>
<span class="k">if</span> <span class="ow">not</span> <span class="n">force</span><span class="p">:</span>
<span class="k">return</span>
<span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s2">"force re-install"</span><span class="p">)</span>
<span class="n">url</span> <span class="o">=</span> <span class="n">url_base</span> <span class="o">+</span> <span class="n">file_name</span>
<span class="n">python_version</span> <span class="o">=</span> <span class="s2">"</span><span class="si">{0}</span><span class="s2">.</span><span class="si">{1}</span><span class="s2">.</span><span class="si">{2}</span><span class="s2">"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="o">*</span><span class="n">sys</span><span class="o">.</span><span class="n">version_info</span><span class="p">)</span>
<span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s2">"python version: </span><span class="si">{}</span><span class="s2">"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">python_version</span><span class="p">))</span>
<span class="k">if</span> <span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">isdir</span><span class="p">(</span><span class="n">conda_path</span><span class="p">):</span>
<span class="n">logger</span><span class="o">.</span><span class="n">warning</span><span class="p">(</span><span class="s2">"remove current miniconda"</span><span class="p">)</span>
<span class="n">shutil</span><span class="o">.</span><span class="n">rmtree</span><span class="p">(</span><span class="n">conda_path</span><span class="p">)</span>
<span class="k">elif</span> <span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">isfile</span><span class="p">(</span><span class="n">conda_path</span><span class="p">):</span>
<span class="n">logger</span><span class="o">.</span><span class="n">warning</span><span class="p">(</span><span class="s2">"remove </span><span class="si">{}</span><span class="s2">"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">conda_path</span><span class="p">))</span>
<span class="n">os</span><span class="o">.</span><span class="n">remove</span><span class="p">(</span><span class="n">conda_path</span><span class="p">)</span>
<span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'fetching installer from </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">url</span><span class="p">))</span>
<span class="n">res</span> <span class="o">=</span> <span class="n">requests</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">url</span><span class="p">,</span> <span class="n">stream</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span>
<span class="n">res</span><span class="o">.</span><span class="n">raise_for_status</span><span class="p">()</span>
<span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">file_name</span><span class="p">,</span> <span class="s1">'wb'</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>
<span class="k">for</span> <span class="n">chunk</span> <span class="ow">in</span> <span class="n">res</span><span class="o">.</span><span class="n">iter_content</span><span class="p">(</span><span class="n">chunk_size</span><span class="p">):</span>
<span class="n">f</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="n">chunk</span><span class="p">)</span>
<span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'done'</span><span class="p">)</span>
<span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'installing miniconda to </span><span class="si">{}</span><span class="s1">'</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">conda_path</span><span class="p">))</span>
<span class="n">subprocess</span><span class="o">.</span><span class="n">check_call</span><span class="p">([</span><span class="s2">"bash"</span><span class="p">,</span> <span class="n">file_name</span><span class="p">,</span> <span class="s2">"-b"</span><span class="p">,</span> <span class="s2">"-p"</span><span class="p">,</span> <span class="n">conda_path</span><span class="p">])</span>
<span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s1">'done'</span><span class="p">)</span>
<span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s2">"installing rdkit"</span><span class="p">)</span>
<span class="n">subprocess</span><span class="o">.</span><span class="n">check_call</span><span class="p">([</span>
<span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">conda_path</span><span class="p">,</span> <span class="s2">"bin"</span><span class="p">,</span> <span class="s2">"conda"</span><span class="p">),</span>
<span class="s2">"install"</span><span class="p">,</span>
<span class="s2">"--yes"</span><span class="p">,</span>
<span class="s2">"-c"</span><span class="p">,</span> <span class="s2">"rdkit"</span><span class="p">,</span>
<span class="s2">"python==</span><span class="si">{}</span><span class="s2">"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">python_version</span><span class="p">),</span>
<span class="s2">"rdkit"</span> <span class="k">if</span> <span class="n">rdkit_version</span> <span class="ow">is</span> <span class="kc">None</span> <span class="k">else</span> <span class="s2">"rdkit==</span><span class="si">{}</span><span class="s2">"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">rdkit_version</span><span class="p">)])</span>
<span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s2">"done"</span><span class="p">)</span>
<span class="kn">import</span> <span class="nn">rdkit</span>
<span class="n">logger</span><span class="o">.</span><span class="n">info</span><span class="p">(</span><span class="s2">"rdkit-</span><span class="si">{}</span><span class="s2"> installation finished!"</span><span class="o">.</span><span class="n">format</span><span class="p">(</span><span class="n">rdkit</span><span class="o">.</span><span class="n">__version__</span><span class="p">))</span>
<span class="k">if</span> <span class="vm">__name__</span> <span class="o">==</span> <span class="s2">"__main__"</span><span class="p">:</span>
<span class="n">install</span><span class="p">()</span>
</div></code></pre></div><span>Tagged with: </span><ul class="tag-list"><li><a href="/tags/tutorial">Tutorial</a></li><li><a href="/tags/codesnippet">Code-Snippet</a></li><li><a href="/tags/colab">Colab</a></li></ul></article></div><footer><p>Made with ❤️ using <a href="https://github.com/johnsundell/publish">Publish</a></p><p><a href="/feed.rss">RSS feed</a></p></footer></body></html>
|