2017-06-25

Stock Screener Programming 1

There are a lot of things to do, and since I'm taking an iterative development approach in this project, my aim for every coding session is to end up with runnable code that improves on the last session's.

For PDF reading I started with the module PyPDF2. When I try to open the PDF with PyPDF2 I just get nonsense output, so I'm going to give pdfminer3k a shot. I found some very recent documentation for pdfminer3k. After fiddling around with pdfminer3k, that library was ruled out as well. Next up is a library called textract. This time the problem came already when installing the package. Slate3k is another one to try, and at least it installs successfully. Wow, this problem doesn't seem to be an easy one to solve. The best solution I've found, which I don't feel like doing, is to install Python 2 and redirect all pdfminer commands there.

What I will do for now is save the PDF as a text file manually and read the text file instead. If I come up with a way to scrape PDF files later on, I'll implement it then.
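
Reading the text file works by splitting every line on whitespace, so a row ends up as a list of words and numbers. Here is a small sketch of that idea. The sample row is made up for illustration, and the real data sheet may well be laid out differently.

```python
# Made-up row for illustration only; not real figures from any data sheet.
sample = "Net Income (Mil) 1,000 2,000 3,000"

# Splitting on whitespace gives three label words followed by the numbers.
tokens = sample.split()
print(tokens)

# Strip the thousands separators before converting to integers.
values = [int(token.replace(",", "")) for token in tokens[3:6]]
print(sum(values) / 3)
```

With a row like this, the label takes up positions 0 to 2 and the yearly values start at position 3.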



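def openDatasheet(file=""):
    """Read a text file and return each line as a list of whitespace-separated words."""
    # The with-block closes the file even if reading fails.
    with open(file, "r", encoding="utf-8") as datasheet:
        lines = [line.split() for line in datasheet]
    print("Number of lines read: ", len(lines))
    return lines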


def getEarnings(parsedList=None):
    """Return the average net income over the last three years."""
    average = None
    for line in parsedList or []:
        # Expected row layout: Net Income (Mil) <value> <value> <value> ...
        if len(line) < 6:
            continue
        if "Net" in line[0] and "Income" in line[1] and "(Mil)" in line[2]:
            # Remove thousands separators before converting to integers.
            values = [int(item.replace(",", "")) for item in line[3:6]]
            average = sum(values) / 3

    if average is None:
        raise ValueError("No 'Net Income (Mil)' row found")
    return int(average)

filename = "pandora.txt"  # input("Text file name:")
parsedData = openDatasheet(file=filename)
avg_net_earnings = getEarnings(parsedList=parsedData)

print("Average net earnings of", filename, avg_net_earnings)

This program takes a text file (I've used the data sheet for the Danish jewellery company Pandora), reads it and prints out the average net earnings. The next step should be to create a Company() object that the parsed data can be saved to.
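
A minimal sketch of what such a Company object might look like. Nothing here is decided yet: the attribute `figures` and the methods `add_row` and `average` are hypothetical names, and the numbers are made up rather than taken from the Pandora data sheet.

```python
from dataclasses import dataclass, field


@dataclass
class Company:
    """Hypothetical container for values read from a data sheet."""
    name: str
    # Maps a label such as "Net Income (Mil)" to its list of yearly values.
    figures: dict = field(default_factory=dict)

    def add_row(self, label, values):
        """Store one row of yearly values under the given label."""
        self.figures[label] = list(values)

    def average(self, label, years=3):
        """Average the first `years` values stored for `label`."""
        values = self.figures[label][:years]
        return sum(values) / len(values)


# Made-up numbers for illustration only.
company = Company(name="Example Co")
company.add_row("Net Income (Mil)", [100, 200, 300])
print(company.average("Net Income (Mil)"))
```

Keeping the rows in a dictionary would let later sessions add more figures than net income without changing the class.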

/Ludvig
