Documentation

As we have probably all heard before, good documentation is almost as important as (if not equally important to) good code itself. You may have written some elegant and powerful code to solve your problems today, but weeks or months from now, that code may become functionally useless if you forget what it does or how to call it. Python3 has a special built-in feature called docstrings that makes documenting functions easy. After going through this module, students should be able to:

Write well-crafted docstrings for all functions
Add type hints to function definitions
Write effective READMEs for a project

Docstrings

Docstrings are special strings that appear as the first statement inside a function, immediately following the def line. By convention they are surrounded by three double quotation marks on each side, and they may span multiple lines. For example:

def a_function():
    """
    This is a docstring.
    """
    # code goes here
    return

The above is a valid docstring, but it is not a very helpful docstring. When you write docstrings, at a minimum try to include the following sections:

A short description of the purpose of the function
A list of arguments, including type
A list of returned values, including type

A better template for a docstring (based on the Google Python Style Guide) might look like:

def a_function(arg1: type, arg2: type) -> type:
    """
    This function does XYZ.

    Args:
        arg1: Define what is expected for arg1.
        arg2: Define what is expected for arg2.

    Returns:
        result: Define what is expected for result.
    """
    # code goes here
    return result

The description should be succinct, yet complete. Arguments should be listed by name, and the expected type of each (e.g., bool, float, str, etc.) should be stated. The returned value(s) should likewise be listed along with the expected type(s).

Let’s look at one more example using a real function:

def add_and_square(num1: float, num2: float) -> float:
    """
    Given two numbers, this function will first add them together, then square the sum
    and return the result.

    Args:
        num1: The first number.
        num2: The second number.

    Returns:
        result: The square of the sum of input arguments.
    """
    result = (num1 + num2) ** 2
    return result

Note

Notice above we are using more-or-less complete sentences with proper grammar.

Next, let’s add docstrings to our ml_data_analysis.py code we have been working on:

models.py

...

def compute_average_mass(landings: list[MeteoriteLanding]) -> float:
    """
    Iterates through a list of meteorite landing objects, adds their masses together,
    and returns that sum divided by the total number of landings.

    Args:
        landings: A list of meteorite landing objects.

    Returns:
        result: The average mass of the landings.
    """
    total_mass = 0.
    for ml in landings:
        total_mass += ml.mass
    return total_mass / len(landings)

def check_hemisphere(ml: MeteoriteLanding) -> str:
    """
    Given a meteorite landing's location (latitude and longitude in decimal notation),
    returns which hemispheres those coordinates land in.

    Args:
        ml: A MeteoriteLanding object.

    Returns:
        location: Short string listing two hemispheres.
    """
    location = ''
    if ml.location.lat > 0:
        location = 'Northern'
    else:
        location = 'Southern'
    if ml.location.long > 0:
        location = f'{location} & Eastern'
    else:
        location = f'{location} & Western'
    return location

def count_classes(landings: list[MeteoriteLanding]) -> dict[str, int]:
    """
    ???
    """
    classes_observed = {}
    for ml in landings:
        if ml.class_name not in classes_observed:
            # Initialize the counter the first time a class is seen
            classes_observed[ml.class_name] = 0

        classes_observed[ml.class_name] += 1
    return classes_observed

EXERCISE

Write the missing docstring for the count_classes() function above.

Let’s now add a call to the count_classes() function in the main() function of ml_data_analysis.py:

ml_data_analysis.py

...

def main():
    with open('Meteorite_Landings_Simple.json', 'r') as f:
        ml_data = json.load(f)

    landings = [MeteoriteLanding(**ml) for ml in ml_data["meteorite_landings"]]

    print(compute_average_mass(landings))

    for ml in landings:
        print(check_hemisphere(ml))

    print(count_classes(landings))

if __name__ == '__main__':
    main()

In general, your main() function does not need a docstring. It is a good habit to write the main() function simply and clearly enough that it is self-explanatory, with perhaps a few comments to help. If you do add a docstring to the main() function, you may write a few short summary sentences but omit the Args and Returns sections.
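As a minimal sketch of that last point, a main() docstring might contain only a short summary. The list of masses below is made-up data for illustration and is not part of the course code:

```python
def main():
    """
    Prints the average of a small, made-up list of masses.
    """
    # Hypothetical data, used here only to keep the sketch self-contained
    masses = [1.0, 2.0, 3.0]
    print(sum(masses) / len(masses))


if __name__ == '__main__':
    main()
```

Note that the docstring has no Args or Returns sections, since main() takes no arguments and returns nothing.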

EXERCISE

Open up the Python3 interactive interpreter (either uv run ipython or uv run python). Import your helper functions from models.py and use the commands dir() and help() to find and read the docstrings that you wrote.
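For reference, here is a rough sketch of a few ways a docstring can be read back. It uses the add_and_square() example from earlier rather than the functions in models.py, so it should be adapted for the exercise:

```python
import inspect


def add_and_square(num1: float, num2: float) -> float:
    """
    Given two numbers, this function will first add them together, then square the sum
    and return the result.

    Args:
        num1: The first number.
        num2: The second number.

    Returns:
        result: The square of the sum of input arguments.
    """
    result = (num1 + num2) ** 2
    return result


# The raw docstring is stored on the function object itself
print(add_and_square.__doc__)

# inspect.getdoc() returns the same text with the indentation cleaned up
print(inspect.getdoc(add_and_square))

# help() prints the signature together with the docstring
help(add_and_square)
```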

Type Hints

Type hints in function definitions indicate what types are expected as input and output of a function. No checking actually happens at runtime, so if you send the wrong type of data as an argument, the type hint itself won’t cause Python to raise an error. Think of type hints simply as documentation or annotations to help the reader understand how to use a function.
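To illustrate, here is a small sketch using a hypothetical double() function (not part of the course code). The hint says int, but Python happily accepts a string and simply applies the * operator to it:

```python
def double(value: int) -> int:
    """
    Returns the input multiplied by two.

    Args:
        value: The integer to double.

    Returns:
        result: Twice the input value.
    """
    result = value * 2
    return result


print(double(4))       # matches the type hint
print(double('ab'))    # violates the type hint, but no error is raised

# The hints are stored as plain metadata on the function object
print(double.__annotations__)
```

The second call returns a repeated string rather than failing, which shows that the hint is only an annotation and not a runtime check.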

Warning

In the code blocks below, we omit docstrings for brevity only. Please keep including docstrings in your code.

Type hints should take the form:

def a_function(arg_name: arg_type) -> return_type:
    # code goes here
    return result

In the above example, we are providing a single argument called arg_name that should be of type arg_type. The return value should be of type return_type. Let’s look at an example using a real function:

def add_and_square(num1: float, num2: float) -> float:
    result = (num1 + num2) ** 2
    return result

Although Python3 does not check or enforce types at run time, other tools use type hints to check types during development. For example, some IDEs (including PyCharm) will evaluate type hints as you write code and alert you if you call a function in a way other than what the type hint suggests. In addition, there are static type checkers like mypy that analyze your Python3 programs without running them and report errors where types don’t match.
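As a rough sketch of the kind of mismatch such a tool looks for, consider the code below. The describe() helper and the variable names are hypothetical and exist only for illustration. The script runs without error in Python3, but a static checker is expected to flag the annotated assignment:

```python
def add_and_square(num1: float, num2: float) -> float:
    result = (num1 + num2) ** 2
    return result


def describe(value: float) -> str:
    # Hypothetical helper, not part of the course code
    return f'value = {value}'


# This call is consistent with the type hints
total = add_and_square(1.0, 2.0)

# Deliberate mismatch: describe() is hinted to return str, but the variable
# is annotated as int. Python runs this line without complaint, whereas a
# static checker such as mypy should report an incompatible assignment.
label: int = describe(total)

print(total)
print(label)
```

Assuming mypy is installed and the code is saved as example.py (a made-up file name), running mypy example.py should point at the line that assigns label.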

README

A README file should be included at the top level of every coding project you work on. Websites like GitHub will automatically look for README files and render them directly in the web interface. Markdown is probably the most common syntax people use to write READMEs. It is very easy to create headers, code blocks, tables, text emphases, and other fancy renderings to make the README pleasant and easy to read.

Note

In this class we ask you to include READMEs in each of your homework folders on GitHub. Each homework is essentially a standalone project, so a dedicated README for each is warranted.

At a minimum, plan to include the following sections in all of your READMEs:

Title: a descriptive, self-explanatory title for the project.
Description: a high-level description of the project that informs the reader what the code does, why it exists, what problem it solves, etc.
Installation: As we advance into the semester our code bases will become more complex with more moving parts. Eventually we will need to start providing detailed instructions about getting the project working plus any requirements.
Usage: The key here is examples! Show code blocks of what it looks like to execute the code from start to finish. Describe what output is expected and how it should be interpreted.

Other general advice includes:

Use proper grammar and more-or-less complete sentences.
Use headers, code blocks, and text emphases (e.g. bold, italics) to make the document readable. There are plenty of tools to preview Markdown before committing to GitHub, so plan to go through several cycles of editing -> previewing to make your README look nice.
Be prepared to include other information about authors, acknowledgements, and licenses in the READMEs as appropriate.
Spend some time browsing GitHub and look for READMEs of other popular projects. There are many correct ways to write a README.

Remember, the README is your chance to document for yourself and explain to others why the project is important, what the code is, and how to use it / interpret the outputs. The advice above is general advice, but it is not one-size-fits-all. Every project is different and ultimately your README may include other sections or organization schemes that are unique to your project.
