Command line interface

Command Line Interface [kəˈmænd ˈlaɪn ˈɪntɚˌfeɪs]: A command-line interface (CLI for short) is a way to interact with a program not by clicking on icons with a mouse, but by typing predefined commands.
Casual users tend to prefer GUIs (graphical user interfaces) because with a CLI you have to learn the commands of every program you want to use.
Advanced users often prefer CLIs because they offer a more powerful way to interact with a program and to automate that interaction.
Many programs only offer a CLI because GUIs need more resources to develop and maintain. Before the 1970s, using a computer terminal with commands was the primary means of interaction.
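
As a rough illustration of what "typing predefined commands" means, the sketch below defines a tiny command-line program with Python's standard argparse module. The program name greet, its name argument and the --shout flag are made up for this example and do not belong to any real tool.

```python
import argparse


def build_parser():
    # Describe the commands and options this hypothetical program understands.
    parser = argparse.ArgumentParser(prog="greet", description="Print a greeting.")
    parser.add_argument("name", help="who to greet")
    parser.add_argument("--shout", action="store_true", help="print in upper case")
    return parser


def run(argv):
    # argv is the list of words the user typed after the program name.
    args = build_parser().parse_args(argv)
    message = f"Hello, {args.name}!"
    return message.upper() if args.shout else message


# Equivalent to typing: greet Ada --shout
print(run(["Ada", "--shout"]))  # prints HELLO, ADA!
```

In a real script you would pass sys.argv[1:] to run(), so the words typed in the terminal decide what the program does. Because the whole interaction is text, it is easy to repeat it from another script, which is why CLIs lend themselves to automation.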

Fuzzy Search

Fuzzy Search [fazi səːtʃ]: An algorithm to find strings that match patterns approximately rather than exactly.

Fuzzy search algorithms are useful for cleaning data because they can cluster strings that are spelled similarly. You can also specify a pattern: for example, if you want to extract e-mail addresses, the string has to contain an @-symbol and a dot. Usually, you provide an input vector of strings, say the names of countries. The algorithm then goes through a given data frame row by row and looks for strings that are spelled similarly to the names in the input vector. That way, a fuzzy search can cluster “Kolumbia”, “Kolumbien” and “Coholumbien” under “Colombia”. Of course, the output isn’t always perfect, especially when two entries in the input vector look similar to each other.
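
The clustering of country names described above can be sketched with Python's standard difflib module. This is only one possible approach, not the method of any particular tool: the COUNTRIES list, the closest_country helper and the similarity cutoff of 0.5 are assumptions chosen for this example.

```python
from difflib import get_close_matches

# The "input vector": the correctly spelled names we want to map to (example data).
COUNTRIES = ["Colombia", "Germany", "Austria", "France"]


def closest_country(raw, candidates=COUNTRIES, cutoff=0.5):
    """Return the candidate spelled most like `raw`, or None if nothing is close."""
    # Compare in lower case so that capitalisation does not count as a difference.
    lookup = {name.lower(): name for name in candidates}
    matches = get_close_matches(raw.strip().lower(), list(lookup), n=1, cutoff=cutoff)
    return lookup[matches[0]] if matches else None


# Each "row" of messy data is matched against the input vector.
for entry in ["Kolumbia", "Kolumbien", "Coholumbien", "Germny"]:
    print(entry, "->", closest_country(entry))
```

The cutoff decides how different a string may be and still count as a match. A lower value catches more misspellings but also produces more wrong matches, which is the imperfection mentioned above.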

Geocoding

Geocoding [dʒiː.oʊkoʊdɪŋɡ]: The process of converting address data (like postal codes, street names or complete addresses) into numerical coordinates that mark their place on the globe.

Most of the time, this place is represented by latitude and longitude. A geocoder is a piece of software that performs this conversion. The spatial data is drawn from a reference dataset that links street addresses to their geographical position, like an OpenStreetMap service or the Google Maps API.

You can use, for example, the Excel Geocoding Tool or OpenRefine to geocode your data.
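
At its core, a geocoder looks up an address in a reference dataset and returns the coordinates stored for it. The toy sketch below imitates that step with a hand-made table. The REFERENCE table and the geocode function are hypothetical, and the coordinates are rounded city-centre values for illustration only. A real geocoder would query a service like the ones named above.

```python
from typing import Optional, Tuple

# Hypothetical reference data: address text -> (latitude, longitude) in decimal degrees.
# Values are rounded approximations of the city centres.
REFERENCE = {
    "berlin, germany": (52.52, 13.40),
    "paris, france": (48.86, 2.35),
}


def geocode(address: str) -> Optional[Tuple[float, float]]:
    """Return (latitude, longitude) for a known address, or None if it is unknown."""
    # Normalise case and spacing so small differences in typing still match.
    key = " ".join(address.lower().split())
    return REFERENCE.get(key)


print(geocode("Berlin, Germany"))
print(geocode("Atlantis"))
```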

Git

Git [ɡɪt]: a version control system. Developers use version control systems to organize their projects, to update and change code and to store these modifications in a central repository. Others can contribute to that repository and download and upload changes.

Git is open source and a command-line tool, which means that it does not have its own graphical user interface. You use it by entering short commands into the terminal/command line.

GitHub

GitHub [ɡɪthʌb]: a web-based hosting service to upload and share code. Also see Bitbucket.

It runs on the open-source version control system Git. Developers use version control systems to organize their projects, to update and change code and to store these modifications in a central repository. Others can contribute to that repository and download and upload changes.

The name GitHub is literally a combination of Git and Hub. Git is a command-line tool. It does not have its own graphical user interface. You use it by entering short commands into the terminal/command line. The Hub of GitHub is the website where developers store their projects and network with the GitHub community.

Histogram

Histogram [ˈhɪs.tə.ɡræm]: A visual representation of a distribution of data that is divided into intervals. For each interval, the number of values that fall into it is counted and then represented by the area of the interval’s column.

In most cases, the intervals will be of equal size, so the columns’ height will be proportional to the frequency of values in each interval. It is also possible to use differently sized intervals. However, since the area is scaled – not the height, like in bar charts – these histograms are not as easy to read and therefore used less often.

Figure: The same dataset visualised in histograms with equally sized and differently sized intervals, respectively.
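
The difference between scaling the area and scaling the height can be checked with a few lines of code. The following is a minimal sketch: the histogram function, the sample values and the interval edges are made up for this example. It counts the values per interval and divides each count by the interval width, so that width times height (the area) equals the count.

```python
from bisect import bisect_right


def histogram(values, edges):
    """Return one (count, width, height) tuple per interval between consecutive edges."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        if edges[0] <= v <= edges[-1]:
            # Find the interval whose left edge is <= v; the last interval
            # also includes its right edge.
            i = min(bisect_right(edges, v) - 1, len(counts) - 1)
            counts[i] += 1
    bins = []
    for i, count in enumerate(counts):
        width = edges[i + 1] - edges[i]
        bins.append((count, width, count / width))  # height = count / width
    return bins


values = [1, 2, 2, 3, 5, 7, 8, 9]
print(histogram(values, [0, 5, 10]))  # equally sized intervals
print(histogram(values, [0, 2, 10]))  # differently sized intervals
```

With equal intervals, both columns have the same width, so comparing heights is the same as comparing counts. With the unequal edges, the second interval holds seven times as many values as the first but is less than twice as tall, which is why such histograms are harder to read.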

HTML

HTML [eɪtʃ tiː ɛm ɛl]: short for HyperText Markup Language. This language is used to build the basic structure of a website.

HTML is a markup language, closely related to XML (Extensible Markup Language), which is generally used to store data, like text and metadata, in a structured way. Strictly speaking, HTML is not a type of XML; only its XHTML variant is.

You can read up on HTML in our tutorial on the basics of web development or on a reference site like W3Schools or the Mozilla Developer Network.
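
To make the "basic structure" of a website more tangible, the sketch below feeds a very small HTML page to Python's standard html.parser module and collects the nested elements as an indented outline. The OutlineParser class, the outline function and the sample PAGE are made up for this example. The sample deliberately avoids elements without a closing tag (such as br), which this simple depth counter would not handle.

```python
from html.parser import HTMLParser

# A minimal example page: a head with a title, and a body with a heading and a paragraph.
PAGE = (
    "<html><head><title>Hello</title></head>"
    "<body><h1>Heading</h1><p>Some text.</p></body></html>"
)


class OutlineParser(HTMLParser):
    """Record every opening tag, indented by how deeply it is nested."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.lines = []

    def handle_starttag(self, tag, attrs):
        self.lines.append("  " * self.depth + tag)
        self.depth += 1

    def handle_endtag(self, tag):
        self.depth -= 1


def outline(html_text):
    parser = OutlineParser()
    parser.feed(html_text)
    parser.close()
    return parser.lines


print("\n".join(outline(PAGE)))
```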
