Ludvig Learns Python

2017-06-26

Stock Screener Update 1

The program can currently make a valuation of a company based on the data sheet it's fed. It doesn't take a growth input yet, and it only outputs the money-making potential as a compounded annual return. It's time to update the list of improvement ideas.

Ideas of improvement

As I was writing the above three steps, an explosion of ideas took place in my head. Although some of the ideas existed before this post, many new ones came to mind. When I came up with the idea for the program, I wanted it to come with a GUI, but because of the complexity I'll start without one.

Time horizon is set by a date instead of a number of years.
Output can be saved as csv.
Output can be saved to a database.
Calculate future value based on return on equity.
Allow user to enter a stock ticker and retrieve data automatically (API or scraping).
Growth rate calculated on historical data instead of arbitrary.
Make a GUI for the program.
Calculation with growth rate.
Checking company's financial health from data sheet.
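
A minimal sketch of how the "growth rate calculated on historical data" idea could look. It assumes the compound annual growth rate between the oldest and the newest figure is what is wanted; the function name and the sample numbers are made up for illustration.

```python
def historical_growth_rate(earnings):
    """Compound annual growth rate from the first to the last value.

    `earnings` is a list of yearly figures, oldest first.
    """
    first, last = earnings[0], earnings[-1]
    periods = len(earnings) - 1
    if periods < 1 or first <= 0 or last <= 0:
        raise ValueError("need at least two positive values")
    return (last / first) ** (1 / periods) - 1


# Made-up earnings that grow by 10 % per year.
print(round(historical_growth_rate([100, 110, 121]), 4))  # 0.1
```

Negative or zero earnings make this formula meaningless, which is why the sketch refuses them instead of guessing.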

Some thoughts

I find it challenging to get a picture of which methods should be part of which class. As I understand it, it's good practice to divide the code into methods and classes depending on how many times they are supposed to be used, and on whether they are program-specific or can be used in other programs as well. The code can be arranged in an almost infinite number of ways, but some ways should be better than others. I hope to learn this as my Python experience grows.

At the moment, the data sheet is scraped into a list, but maybe I should use a tuple instead? The data sheet info shouldn't be modified in the program.
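
A small sketch of what the tuple variant might look like, assuming the data sheet has already been read as lines of text. The function name freeze_datasheet and the sample lines are hypothetical, not part of the program.

```python
def freeze_datasheet(lines):
    """Turn an iterable of text lines into a tuple of tuples of words."""
    return tuple(tuple(line.split()) for line in lines)


# Made-up sample lines in the same style as the data sheet.
sheet = freeze_datasheet([
    "Net Income (Mil) 1,000 1,200 1,400",
    "Price/Book 3.1 2.8",
])
print(sheet[0][:2])  # ('Net', 'Income')

# Tuples cannot be changed after creation, so accidental edits fail loudly.
try:
    sheet[0][0] = "Gross"
except TypeError as error:
    print("read-only:", error)
```

Indexing and looping work the same way as with lists, so the parsing functions would not need to change.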

/Ludvig

2017-06-25

Stock Screener Programming 2

The program now calculates the three-year average net income and the lowest historical price/book key figure and saves them in a Company object.

main.py


def open_datasheet(file=""):
    """Read the text file and return a list of lines, each split into words"""
    with open(file, "r", encoding="utf-8") as datasheet:
        return [line.split() for line in datasheet]


def get_earnings(parsedList=""):
    """This method returns the last three year average net income"""
    # Use the parsedList argument instead of the global parsedData
    for line in parsedList:
        for item in line:
            if "Net" in item:
                if "Income" in line[1]:
                    if "(Mil)" in line[2]:
                        average = (
                                    int(line[3].replace(",", "")) +
                                    int(line[4].replace(",", "")) +
                                    int(line[5].replace(",", ""))
                                  ) / 3
    return int(average)


def get_lowest_pb(parsedList=""):
    """This method returns the lowest historical Price/Book value"""
    pb_list = []

    for line in parsedList:
        for item in range(0, len(line)):
            #if "www.nasdaqomxnordic.com/" in line[item]:
            if "Price/Book" in line[item]:
                # Collect up to five values following the "Price/Book" label
                for x in range(item + 1, item + 6):
                    try:
                        pb_list.append(float(line[x]))
                    except (ValueError, IndexError):
                        continue
    min_pb = min(pb_list)

    return min_pb


def create_company():
    """This function creates a Company object"""
    import company
    new_company = company.Company(companyName=filename,
                                  avg_earnings=avg_net_earnings,
                                  lowest_pb=lowest_pb,
                                  datasheet=open_datasheet(filename))
    return new_company

filename = input("Text file name:")
parsedData = open_datasheet(file=filename)
avg_net_earnings = get_earnings(parsedList=parsedData)
lowest_pb = get_lowest_pb(parsedList=parsedData)

current_company = create_company()

print(current_company.company_name, current_company.avg_earnings, current_company.lowest_pb)

company.py


class Company:
    """The Company() object hold information of a company"""
    def __init__(self, companyName="NoName", avg_earnings = 0, lowest_pb = 0, datasheet=""):
        self._companyName = companyName.title()[:-4]
        self._avg_earnings = avg_earnings
        self._lowest_pb = lowest_pb
        self._datasheet = datasheet

    @property
    def avg_earnings(self):
        return self._avg_earnings

    @property
    def company_name(self):
        return self._companyName

    @property
    def lowest_pb(self):
        return self._lowest_pb

    @property
    def datasheet(self):
        return self._datasheet

if __name__ == "__main__":
    print("Company class is run as main!")

Stock Screener Programming 1

There are a lot of things to do, and since I'm going to take an iterative development approach in this project, my aim for every coding session is to have runnable code that is improved from the last session.

For PDF reading I'm using the module PyPDF2. When I try to open the PDF with PyPDF2 I get nothing but nonsense output, so I'm going to give pdfminer3k a shot. I found some very recent documentation on pdfminer3k. After fiddling around with pdfminer3k, that library was ruled out as well. The next library up was one called textract. This time the problem appeared already when installing the package. Slate3k is another one to try, and at least it installs successfully. Wow, this problem doesn't seem to be an easy one to solve. The best solution I can find, which I don't feel like doing, is to install Python 2 and redirect all pdfminer commands there.

What I will do for now is to save the pdf as a text file manually and read the text file instead. If I come up with a way to scrape PDF files later on, I'll implement it then.


def openDatasheet(file=""):
    with open(file, "r", encoding="utf-8") as datasheet:
        lines = [line.split() for line in datasheet]
    print("Number of lines read: ", len(lines))
    return lines


def getEarnings(parsedList=""):
    """This method returns the last three year average net income"""
    # Use the parsedList argument instead of the global parsedData
    for line in parsedList:
        for item in line:
            if "Net" in item:
                if "Income" in line[1]:
                    if "(Mil)" in line[2]:
                        average = (
                                    int(line[3].replace(",", "")) +
                                    int(line[4].replace(",", "")) +
                                    int(line[5].replace(",", ""))
                                  ) / 3

    return int(average)

filename = "pandora.txt" #input("Text file name:")
parsedData = openDatasheet(file=filename)
avg_net_earnings = getEarnings(parsedList=parsedData)

print("Average net earnings of", filename, avg_net_earnings)

This program takes a text file (I've used the data sheet for the Danish jewellery company Pandora), reads it and prints out the average net earnings. The next step should be to create an object Company() that the read data can be saved to.

/Ludvig

2017-06-24

Stock Screener from PDF Data Sheet

I want to build a program that can read data from a PDF file and make some calculations on the data, which it will print out. This is to speed up the quantitative analysis of companies whose stocks are publicly traded. The program won't produce pages of processed information to be used as a complete quantitative analysis, but as an advanced stock screener I hope it'll do the trick. The aim is to solve the problem of manual data harvesting for stock screening.

In a book I am currently reading, "Lindblad, Erik. 2006. Programmering i Python" (about Python 2, but good enough for a beginner at my level), the author writes that, just as in every other scientific area I've been in touch with, planning the work is just as important as executing it. The author writes about three "steps of OOP".

Object oriented analysis
Object oriented design
Programming

1. Object Oriented Analysis

This step is about identifying the objects that will be used, which attributes they will have and which methods they need. By describing in written text what the program is supposed to do, the nouns in the text can be used as objects and the verbs as methods in the code.

"The user uses the program to open a locally saved PDF company data sheet file from which all data is read. Compounded future earnings over an arbitrary time horizon are calculated with an arbitrary growth rate. The compounded earnings are then added to the company's equity to give a figure of what the equity (excluding shareholder paybacks such as dividends and share buy-backs) will be at the end of the given time. By multiplying the future equity by the lowest historical price/book valuation, a future valuation is calculated. The current valuation is calculated and compared to the current price of the stock. The program will tell the user what the upside or downside in the company is, both in percent and in money."

Objects (nouns)

Company data sheet
Current company

Compounded future earnings
Current equity
Compounded future equity
Lowest historical P/B
Future valuation of compounded equity

Arbitrary time horizon
Arbitrary growth rate
Current stock price

Methods (verbs)

Open a PDF and read data.
Read time horizon from user input.
Read growth rate from user input.
Read current stock price from user input.
Read lowest P/B from data sheet.
Read last three years net earnings from data sheet.
Read equity from data sheet.
Calculate compounded future earnings.
Calculate future valuation of compounded equity.
Calculate current valuation of future compounded valuation.
Calculate up/down side.

This definitely grew larger than expected. I'm thinking that "Current company" will be a superclass and the indented points below it will be subclasses of it. The relationship between the superclass and the subclasses will be a strong "has-a" composition relationship... I'll postpone the UML modelling for later.
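
Before moving on to the design, here is a minimal sketch of the arithmetic in the quoted description. The assumptions are mine, not part of the plan: all earnings are retained, earnings grow by the same rate every year starting from the three-year average, and every figure is expressed per share. The function names are hypothetical.

```python
def compounded_future_equity(equity, avg_earnings, growth, years):
    """Return the equity after `years`, adding each year's growing earnings."""
    earnings = avg_earnings
    for _ in range(years):
        earnings *= 1 + growth  # assumed: growth applies from the first year
        equity += earnings
    return equity


def future_valuation(equity, avg_earnings, growth, years, lowest_pb):
    """Future equity valued at the lowest historical price/book multiple."""
    return compounded_future_equity(equity, avg_earnings, growth, years) * lowest_pb


def annual_return(future_value, current_price, years):
    """Compounded annual return implied by current_price -> future_value."""
    return (future_value / current_price) ** (1 / years) - 1


# Made-up per-share figures: equity 50, earnings 10, no growth, 5 years, P/B 2.
value = future_valuation(equity=50.0, avg_earnings=10.0, growth=0.0,
                         years=5, lowest_pb=2.0)
print(value)  # 200.0
print(round(annual_return(value, current_price=100.0, years=5), 4))  # 0.1487
```

With zero growth the equity simply rises by the average earnings each year, which makes the numbers easy to check by hand.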

2. Object Oriented Design

The design step is intended to concretize the model made in the analysis step by setting attribute names and types. Methods are also named, and their arguments and return types are decided. The relationships between objects are described in detail (maybe UML is a good help?). I'm going to make a kind of flow description to visualize the process.

User opens PDF file in program. datasheet = openDatasheet()
A module for PDF reading is imported (PyPDF2).
A method for parsing the PDF data is run. parsePdf(datasheet)

Return last three years net earnings as a list
Return lowest P/B as a float
Return equity as an integer

Time horizon input is read. time
Growth rate input is read. growth
Current stock price input is read. currentStockPrice
An instance of Company() is created. Company() has five arguments.

earningsList=""
pb=""
equity=""
time=""
growth=""

Calculator(Company()) is a class that takes a Company object and currentStockPrice and returns two values.

Return upDownSidePercent as float
Return upDownSideCash as float
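
The flow above could be sketched as two small classes. This is only an illustration of the design, not finished code: the method names futureValuation and upDownSide are my own assumptions, all figures are assumed to be per share, and earnings are assumed to grow by the growth rate every year and be fully retained.

```python
class Company:
    """Holds the figures read from the data sheet and from user input."""

    def __init__(self, earningsList, pb, equity, time, growth):
        self.earningsList = earningsList  # last three years' net earnings
        self.pb = pb                      # lowest historical P/B
        self.equity = equity              # current equity
        self.time = time                  # time horizon in whole years
        self.growth = growth              # yearly growth rate, e.g. 0.05


class Calculator:
    """Compares a Company's projected valuation with the current price."""

    def __init__(self, company, currentStockPrice):
        self.company = company
        self.currentStockPrice = currentStockPrice

    def futureValuation(self):
        c = self.company
        earnings = sum(c.earningsList) / len(c.earningsList)
        equity = c.equity
        for _ in range(c.time):
            earnings *= 1 + c.growth
            equity += earnings
        return equity * c.pb

    def upDownSide(self):
        """Return (upDownSidePercent, upDownSideCash)."""
        upDownSideCash = self.futureValuation() - self.currentStockPrice
        upDownSidePercent = upDownSideCash / self.currentStockPrice * 100
        return upDownSidePercent, upDownSideCash


# Made-up figures for illustration only.
company = Company(earningsList=[9.0, 10.0, 11.0], pb=2.0,
                  equity=50.0, time=5, growth=0.0)
print(Calculator(company, currentStockPrice=100.0).upDownSide())  # (100.0, 100.0)
```

Keeping the calculation in Calculator leaves Company as a plain container for data, which fits the idea of filling it from the data sheet first.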

Now that I'm doing all this planning, my mind brings me back to what I've heard and read about iterative development (XP, Scrum etc.): planning is good, but the result probably won't be anything like what was thought from the beginning.

3. Programming

I'm going to write this program in Python 3.6, and I'm going to try to follow an iterative process in which I have a working prototype as often as possible.

Ideas of improvement

Time horizon is set by a date instead of a number of years.
Output can be saved as csv.
Output can be saved to a database.
Calculate future value based on return on equity.
Allow user to enter a stock ticker and retrieve data automatically (API or scraping).
Growth rate calculated on historical data instead of arbitrary.
Make a GUI for the program.

/Ludvig

2017-06-22

Project Euler Problem 67

Since problem 67 was just a "harder" version of problem 18, I figured I'd try my problem 18 solution on it, and voila, it worked!

with open('pe67.txt', 'r') as file:
    numbers = [[int(num) for num in line.split()] for line in file]

# Start at the second-to-last row and work upwards: add to every number
# the larger of the two numbers directly below it.
for row in range(len(numbers) - 2, -1, -1):
    for item in range(len(numbers[row])):
        numbers[row][item] += max(numbers[row + 1][item], numbers[row + 1][item + 1])

print(numbers[0][0])

2017-06-21

Project Euler Problem 18

with open('pe18.txt', 'r') as file:
    numbers = [[int(num) for num in line.split()] for line in file]

# Start at the second-to-last row and work upwards: add to every number
# the larger of the two numbers directly below it.
for row in range(len(numbers) - 2, -1, -1):
    for item in range(len(numbers[row])):
        numbers[row][item] += max(numbers[row + 1][item], numbers[row + 1][item + 1])

print(numbers[0][0])

2017-06-19

Project Euler Problem 17

For this problem I want to use a dictionary.

This problem took me far too long to solve, all because of a mistake caused by naming my dictionaries too similarly. When I thought I was adding "hundred_and_text", I was really adding "hundreds_text". What made it really annoying is that I made my code output all the thousand numbers in words, which worked great. For troubleshooting I copied the output into Excel and counted the letters in the cells, which gave the correct answer, but when I used my code for counting, I missed ~300 letters. Next time I do something like this, I'm going to give the dictionaries more distinguishable names.

In this program I also worked in multiple files, i.e. I had my dictionaries in a file called "dicts.py" and my main program in "main.py" (mostly because I could).
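
The troubleshooting approach described above (spell out every number, then count the letters) can also be done directly in Python as an independent cross-check. This is a minimal sketch and not the solution I used; the names ONES, TENS and number_to_words are made up for the example.

```python
ONES = ["", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n):
    """Spell out 1..1000 without spaces or hyphens, using 'and' after hundreds."""
    if n == 1000:
        return "onethousand"
    words = ""
    if n >= 100:
        words += ONES[n // 100] + "hundred"
        if n % 100:
            words += "and"
    rest = n % 100
    if rest < 20:
        words += ONES[rest]
    else:
        words += TENS[rest // 10] + ONES[rest % 10]
    return words


# The two examples given in the problem statement.
print(len(number_to_words(342)))  # 23
print(len(number_to_words(115)))  # 20

# Total number of letters for 1 to 1000.
print(sum(len(number_to_words(n)) for n in range(1, 1001)))
```

Because every number is built from the same two lists, there is no pair of look-alike dictionaries to mix up.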

Dictionary code:

singulars_text = {  # 36
    1: 'one',
    2: 'two',
    3: 'three',
    4: 'four',
    5: 'five',
    6: 'six',
    7: 'seven',
    8: 'eight',
    9: 'nine',
}

ten_to_nineteen_text = {  # 70
    1: 'ten',
    2: 'eleven',
    3: 'twelve',
    4: 'thirteen',
    5: 'fourteen',
    6: 'fifteen',
    7: 'sixteen',
    8: 'seventeen',
    9: 'eighteen',
    10: 'nineteen'
}

tens_text = {  # 46
    2: 'twenty',
    3: 'thirty',
    4: 'forty',
    5: 'fifty',
    6: 'sixty',
    7: 'seventy',
    8: 'eighty',
    9: 'ninety'
}

hundreds_text = {  # 99
    1: 'onehundred',
    2: 'twohundred',
    3: 'threehundred',
    4: 'fourhundred',
    5: 'fivehundred',
    6: 'sixhundred',
    7: 'sevenhundred',
    8: 'eighthundred',
    9: 'ninehundred',
}

hundred_and_text = {  # 126
    1: 'onehundredand',
    2: 'twohundredand',
    3: 'threehundredand',
    4: 'fourhundredand',
    5: 'fivehundredand',
    6: 'sixhundredand',
    7: 'sevenhundredand',
    8: 'eighthundredand',
    9: 'ninehundredand',
}

Program code:

import dicts


def number_counter_text():

    tot = 0
    for k, v in dicts.singulars_text.items():
        # print(dicts.singulars_text[k])
        tot += len(dicts.singulars_text[k])

    for k, v in dicts.ten_to_nineteen_text.items():
        # print(dicts.ten_to_nineteen_text[k])
        tot += len(dicts.ten_to_nineteen_text[k])

    for k, v in dicts.tens_text.items():
        # print(dicts.tens_text[k])
        tot += len(dicts.tens_text[k])
        for a, b in dicts.singulars_text.items():
            # print(dicts.tens_text[k], end="")
            # print(dicts.singulars_text[a])
            tot += len(dicts.tens_text[k]) + len(dicts.singulars_text[a])

    for k, v in dicts.hundred_and_text.items():
        # print(dicts.hundreds_text[k])
        tot += len(dicts.hundreds_text[k])
        for x, y in dicts.hundreds_text.items():
            # print(dicts.hundred_and_text[k], end="")
            # print(dicts.singulars_text[x])
            tot += len(dicts.hundred_and_text[k]) + len(dicts.singulars_text[x])
        for a, b in dicts.ten_to_nineteen_text.items():
            # print(dicts.hundred_and_text[k], end="")
            # print(dicts.ten_to_nineteen_text[a])
            tot += len(dicts.hundred_and_text[k]) + len(dicts.ten_to_nineteen_text[a])
        for c, d in dicts.tens_text.items():
            # print(dicts.hundred_and_text[k], end="")
            # print(dicts.tens_text[c])
            tot += len(dicts.hundred_and_text[k]) + len(dicts.tens_text[c])
            for e, f in dicts.singulars_text.items():
                # print(dicts.hundred_and_text[k], end="")
                # print(dicts.tens_text[c], end="")
                # print(dicts.singulars_text[e])
                tot += len(dicts.hundred_and_text[k]) + len(dicts.tens_text[c]) + len(dicts.singulars_text[e])
    # print("onethousand")
    tot += len("onethousand")

    print("----- result -----")
    print(tot)


number_counter_text()

/Ludvig