Projects, Joseph Jakuta's Blog

ERTAC EGU Code v3.0 Available

June 25, 2022 (updated July 4, 2022) / ERTAC EGU / by josephjakuta

A new version of the ERTAC EGU code, v3.0, is now available. The new version simply ports v2.2 to work with Python v3.x, so this upgrade was more about retesting features than actual code changes. James at MDE coded the port and I focused on the testing.

The new code base is available on github: https://github.com/bukim1/ERTAC-EGU-Emission-Projection-Tool. I hope others can put the code to good use.

He Knows If You’ve Been Voting

May 25, 2022 (updated May 26, 2022) / Projects • r coding / by josephjakuta

In 2020, I volunteered to write letters for VoteFwd. The idea behind it is that receiving a letter from someone in the leadup to an election can motivate reluctant voters to show up at the polls. Since voting is vital for our democracy to function, I joined numerous other activists in writing such letters, signing up for 20 letters each in Texas and North Carolina. I wrote my letters and sent them off as part of the big send, but I was wondering, did I have an impact?

Forty letters is not too large a number, but it was bordering on the point where some scripting would help me get to the bottom of my question. Each letter came with a printable PDF template that included the voter’s address (though I fully handwrote the actual letters; sorry, I thought the template was a little tacky). I took all of the PDF templates for the voters I had written to, saved them in a folder, and wrote an R script to create a CSV of the addresses (see code below).

The script produced a nice CSV file with most of the information I needed, so I imported it into my Google contacts. I also labeled the contacts so I could easily find these 40. Had I known, I would also have added a phone number and removed the middle names for each contact, so I recommend doing this to the CSV at this point; more on this later.
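The cleanup recommended above could be sketched in R roughly as follows. This is only an illustration under stated assumptions: the helper `drop_middle_names`, the toy rows, and the placeholder phone numbers are all made up for the example, and keeping only the first word of the first-name field will also shorten genuine two-word first names.

```r
# Hypothetical helper: keep only the first word of the first-name field,
# which drops any middle names or initials that follow it.
drop_middle_names <- function(first_name) {
  sub("\\s.*$", "", trimws(first_name))
}

# Toy rows standing in for the CSV written by the script (made-up people).
# In practice this would come from something like read.csv("Addresses.csv").
addresses <- data.frame(
  First.Name = c("Maria Elena", "John Q.", "Sam"),
  Last.Name  = c("Garcia", "Public", "Lee"),
  State      = c("TX", "NC", "TX"),
  stringsAsFactors = FALSE
)

addresses$First.Name <- drop_middle_names(addresses$First.Name)

# Assumed placeholder numbers, one per state, using an in-state area code
placeholder_phone <- c(TX = "512-555-0100", NC = "919-555-0100")
addresses$Phone <- unname(placeholder_phone[addresses$State])

print(addresses)
```

The cleaned data frame could then be written back out with `write.csv` before importing it into contacts.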

The next step was to actually look up whether my folks voted. I imported the addresses into Google contacts because I wanted to use the phone app Impactive. This app lets you look up which of your friends regularly vote, and it is especially easy to use if you have their addresses in your contacts, since it checks them against voter files. To get the contacts to show up, though, they also needed phone numbers, which is why I added phone numbers after the fact. Since I didn’t actually have the 40 voters’ numbers, I just picked random numbers with Texas and North Carolina area codes.

And then I waited (March 2021). And waited (June 2021). And waited (September 2021). And waited (December 2021). And waited (February 2022). See, it apparently takes a while for the voter file to update. I usually checked myself first before looking for my 40 voters, since I know that I voted in 2020.

Finally, on May 7, 2022, I checked Impactive and it showed that I had voted in 2020, so I knew it was time to check my 40 voters. This is when I discovered that including the middle names made it harder to find the voters (I had hypothesized that they would make it easier, but all of the middle names had to be deleted to get Impactive to work). And here are the results.

State	Voted?	Found?	Percent Voted
NC	7	17	41%
TX	11	17	65%
Grand Total	18	34	53%

Unfortunately, I couldn’t find everyone. Some people didn’t show up, perhaps because they had moved, and some had common names and lived in a large city, so I couldn’t tell which person was the one who had potentially gotten my letter. I did count one such person, though: everyone I found with that common name had voted, so whichever of them got my letter must have voted too.

It did appear that I could have had an impact. Obviously, it is a small sample size and I don’t have a counterfactual, but overall 53% of the 34 voters that I could find (out of the 40 I wrote to) voted. This definitely made me think I should sign up for VoteFwd again in 2022 (you can too: https://votefwd.org/campaigns) and write more letters so I have a bigger sample size. I might try to do some with a template too, to see if they are more or less effective (it certainly is quicker to use the template).
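For anyone who wants to check the arithmetic, the percentages in the table can be reproduced from the counts with a few lines of R. This is just a sketch: the data frame and its column names are made up for the example, and the exact binomial interval from base R's `binom.test` is one simple way to show how much uncertainty a sample of 34 carries, not an analysis I ran at the time.

```r
# Counts copied from the results table above
results <- data.frame(
  state = c("NC", "TX"),
  voted = c(7, 11),
  found = c(17, 17)
)

# Add the grand total row, then compute the share of found voters who voted
results <- rbind(
  results,
  data.frame(state = "Grand Total",
             voted = sum(results$voted),
             found = sum(results$found))
)
results$percent_voted <- round(100 * results$voted / results$found)
print(results)

# 95% exact binomial confidence interval for the overall share (18 of 34)
overall_ci <- binom.test(18, 34)$conf.int
print(round(100 * overall_ci))
```

With only 34 voters the interval should come out fairly wide, which is the small sample size caveat in numeric form.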

RScript

library(dplyr)
library(pdftools)

# Find every letter PDF in the folder (pattern is a regular expression, not a glob)
filenames <- list.files("/filepath/VoteForward Complete", pattern = "\\.pdf$", full.names = TRUE)
addresses <- data.frame(matrix(ncol = 6, nrow = 0))
colnames(addresses) <- c("First Name", "Last Name", "Street Address", "City", "State", "Zip")

for (f in filenames) {
  # Words on the first page, ordered top to bottom and then left to right
  letter_pdf <- pdf_data(f)[[1]]
  letter_text <- as.data.frame(arrange(letter_pdf, y, x)$text)
  i <- 0
  step <- 0  # 0 = looking for the address, 1 = name, 2 = street, 3 = city
  first_name <- ""
  last_name <- ""
  street <- ""
  city <- ""
  state <- ""
  zip <- ""

  while (i < nrow(letter_text)) {
    i <- i + 1
    current_row <- as.character(letter_text[i, ])

    # The steps are checked from last to first so that the word that ends one
    # field is not also treated as part of the next field in the same pass.

    # City: collect words until one ends in a comma; state and zip follow it
    if (step == 3) {
      if (substr(current_row, nchar(current_row), nchar(current_row)) == ",") {
        step <- 0
        city <- paste(city, substr(current_row, 1, nchar(current_row) - 1))
        current_row <- as.character(letter_text[i + 1, ])
        state <- current_row
        current_row <- as.character(letter_text[i + 2, ])
        zip <- current_row
        addresses <- rbind(addresses,
                           data.frame(`First.Name` = first_name, `Last.Name` = last_name, `Street.Address` = street, City = city, State = state, Zip = zip))
      } else {
        city <- ifelse(city == "", current_row, paste(city, current_row))
      }
    }

    # Street: collect words until one ends in a comma
    if (step == 2) {
      if (substr(current_row, nchar(current_row), nchar(current_row)) == ",") {
        step <- 3
        street <- paste(street, substr(current_row, 1, nchar(current_row) - 1))
      } else {
        street <- ifelse(street == "", current_row, paste(street, current_row))
      }
    }

    # Name: the word ending in a comma is the last name; earlier words are first/middle names
    if (step == 1) {
      if (substr(current_row, nchar(current_row), nchar(current_row)) == ",") {
        step <- 2
        last_name <- substr(current_row, 1, nchar(current_row) - 1)
      } else {
        first_name <- ifelse(first_name == "", current_row, paste(first_name, current_row))
      }
    }

    # The address block starts right after the words "voting." and "For:"
    if (i < nrow(letter_text) &&
        as.character(letter_text[i, ]) == "voting." &&
        as.character(letter_text[i + 1, ]) == "For:") {
      i <- i + 1
      step <- 1
    }
  }
}
write.csv(addresses, "Addresses.csv")

ERTAC EGU Code v2.2 Available

July 19, 2021 / Code • ERTAC EGU / by josephjakuta

It is exciting that the new version of the ERTAC EGU code, v2.2, is now available. I have worked on developing the new features for this open source Python codebase, which uses EPA and state data to project future emissions from power plants. It was great to work with partners in several states and regional organizations to then test and evaluate the new code.

Full details of the new features are in a README, but some of the things I am particularly excited about are the so-called “HIZG” hours, the new fuel/unit types, and the state input files.

We have found in previous versions that, in projections, emissions from startups and shutdowns don’t get maintained. This is because, in the hourly Clean Air Markets Data (CAMD) that ERTAC EGU relies upon, startups and shutdowns have heat input and emissions but no gross load, hence the name Heat Input Zero Gross load (HIZG). Since ERTAC EGU projects by growing or shrinking generation in response to supplied changes in demand, hours with no generation were simply dropped from consideration. With ERTAC v2.2 those hours can now be maintained. This not only allows startups and shutdowns to be projected to future years, it also is one of two necessary requirements to allow ERTAC EGU to process so-called “non-EGUs,” essentially units in CAMD that do not generate power (e.g., oil refineries, steel mills, pulp mills).
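To make the HIZG idea concrete, here is a minimal sketch in R of flagging such hours in a table of hourly data. It is not taken from the ERTAC EGU code (which is Python): the column names `gross_load` and `heat_input` and the toy values are assumptions for illustration only, not actual CAMD or ERTAC field names.

```r
# Toy hourly records for one unit (made-up values and hypothetical column names)
hourly <- data.frame(
  hour       = 1:5,
  gross_load = c(0, 0, 120, 150, 0),     # generation in each hour
  heat_input = c(0, 85, 1100, 1350, 60)  # fuel heat input in each hour
)

# An hour is HIZG when fuel was burned but no generation was reported
hourly$hizg <- hourly$heat_input > 0 & hourly$gross_load == 0

print(hourly)
print(sum(hourly$hizg))  # number of HIZG hours in the toy data
```

In this toy data, hour 1 is a true offline hour (no heat input, no load), while hours 2 and 5 are the startup and shutdown style hours that a projection scaled purely on generation would lose.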

The other major improvement is the ability to add new fuel/unit types. The code was originally written to limit the user to five fuel/unit types, but a new input now allows users to add more. This was particularly useful to the group because it allows non-EGUs to be processed. It also allows some types of power generators that were ignored in the original code, such as various biomass facilities, to be projected. With the incorporation of other data sources and extensive use of the “demand transfer” functionality, one might be able to include renewables, though this was never tested. Also, the ability to override the default state.csv file was included, which could allow this tool to be used to subdivide states (for instance, to separate EGUs within a nonattainment area from those outside it in the same state) or to be used in other countries where hourly power plant data can be found.

This is all quite exciting. The new code base is available on github: https://github.com/bukim1/ERTAC-EGU-Emission-Projection-Tool. I hope others can put the code to good use.

Development of a Fuel Economy Based Vehicle Excise Tax in the District of Columbia

May 19, 2021 (updated July 19, 2021) / Code • Presentations • Projects • r coding / by josephjakuta

Development of a Fuel Economy Based Vehicle Excise Tax in the District of Columbia
2nd Annual Transportation, Air Quality, and Health Symposium – CARTEEH
May 18, 2021
A presentation given to the 2nd Annual Transportation, Air Quality, and Health Symposium – CARTEEH on a project I completed in R.

View the Presentation

Introduction to the r4moves Package – 2021 MARAMA Mobile Source Workshop

March 18, 2021 (updated May 25, 2022) / Presentations • r coding • r4moves / by josephjakuta

Introduction to the r4moves Package

2021 MARAMA Mobile Source Workshop

A presentation given to the 2021 MARAMA Mobile Source Workshop on the development of the r4moves package.

View the Presentation