Posts

  • Install latest version of Python on Ubuntu
    On Linux, developers and data scientists often end up using the default Python version shipped in the distribution's package repositories, which can mean a long wait before you get to try Python's new features. This post describes how to compile and install an additional Python version without interfering with the system Python, and how to create a virtual environment that uses the new version.
  • Install Jupyter extensions
    Jupyter extensions are a great way to increase your productivity when working with notebooks. In this post I show how to install them along with a configuration tool, describe some of my favourites, and share my personal configuration file.
  • PySpark - create DataFrame from scratch
    These snippets show how to build a DataFrame from scratch from a list of values. This is mainly useful for creating small DataFrames in unit tests. Imagine we would like a table with an id column identifying a user, plus two columns for the number of cats and dogs that user has.
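
As a rough illustration of the cats-and-dogs table described in the last post, here is a minimal sketch. The values, the column names `id`, `cats` and `dogs`, and the helpers `build_pets_df` and `main` are assumptions made up for this example, not taken from the post itself. It relies on `SparkSession.createDataFrame` accepting a list of tuples together with a list of column names.

```python
# Hypothetical example data: one (id, cats, dogs) tuple per user.
COLUMNS = ["id", "cats", "dogs"]
ROWS = [(1, 2, 0), (2, 0, 1), (3, 1, 3)]


def build_pets_df(spark):
    """Build a small DataFrame from ROWS using an existing SparkSession."""
    # Passing only column names lets Spark infer the column types from the data.
    return spark.createDataFrame(ROWS, schema=COLUMNS)


def main():
    # Imported here so the example data can be reused without PySpark installed.
    from pyspark.sql import SparkSession

    # A single-threaded local session is enough for a tiny test table.
    spark = SparkSession.builder.master("local[1]").appName("pets-example").getOrCreate()
    try:
        build_pets_df(spark).show()
    finally:
        spark.stop()
```

Calling `main()` in an environment with PySpark installed should print the three-row table. In a unit test you would more likely call `build_pets_df` with a session provided by a test fixture.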