Friday, April 22, 2016

Network Analysis

Background

With the increased activity currently being experienced in the frac sand mining industry in Wisconsin, "road damage over local roads is a concern for many communities" (Hart, Adams, Schwartz 5). To help deal with these concerns, county governments are implementing ways to help recover the costs of road damage. Chippewa County, for example, is "using road use and maintenance agreements (RUMA) to fund road repairs to frac sand haul routes" (Hart, Adams, Schwartz 19).


Goals and Objectives

The goal of this assignment is to execute a network analysis in order to calculate the impact of transporting sand on local roads from the mines to the rail terminals. Keep in mind, the number of trips and cost in each county per year is strictly hypothetical. 

The objectives for this exercise are as follows:
  1. Using PyScripter, set up script and variables, write several SQL statements to select mines based on certain criteria, use the query statement to run queries, and finally select the mines within 1.5 km from a railroad and remove them from the mines
  2. Practice with and use the closest facility solver
  3. Build a model to calculate the closest facility route
  4. Calculate the hypothetical cost of sand truck travel routes on roads by county
Data sets and sources include:
  • Network Dataset - ESRI street map USA
  • Mines dataset - Wisconsin DNR

Methods

In the first part of this exercise, I wrote python script to select the mines that would be used in the analysis later on (refer to my "Python Script" blog post for the script itself). The mines chosen had to meet the following criteria:
  • The mine must be active
  • The mine cannot have a rail loading station on-site as this would not impact local roads
  • The mine must not be within 1.5 km of the rail since it is then likely that a spur has been built to route the rail line to the mine
After writing the script successfully and producing the mines_norail_final feature class, it was time to move on to figuring out which rail terminal each mine would travel to and which routes were most time efficient. Using model builder, I built a model that would find the most efficient routes between the desired mines and rail terminals, sum up the total route distance per county, and calculate the cost per county per year for the routes.

The steps I took in model builder to get my results are as follows (refer to figure 1 as I describe the steps; start from top left and follow the arrows):

  1. In order to make the Closest Facility model, I had to first add the Make Closest Facility Layer tool and enter the streets layer which came from ESRI street map USA
  2. The next step was to add the mines_norail_final and railterms_final layers for the locations
  3. Once this was done, I was able to run the Solve tool which calculated the most time efficient routes between our inputted locations 
  4. Next, I wanted to turn the routes into its own layer, so I added the Select Data tool to select the routes, then ran the Copy Features tool which produced the routes_copy layer
  5. In order to figure out how much route distance ran through each county, I had to use the Intersect tool to intersect routes_copy and WI_County_Bnds (WI county boundaries)
  6. I needed to get a table that would sum up the total routes and route distances per county, I had to run the Summary Statistics tool where I ran a SUM on Shape_length (route distances) and used the county names as the case field which summed up the number of routes and total distance for each county; this produced the Sum_Output_4 table
  7. To convert the route distance into miles (since it was in meters) and display this in the table, I used the Add Field tool to add a field named Route_Miles then ran the Calculate Field tool where I multiplied the Shape_length by 1609.34 (the number of meters in a mile)
  8. The final step involved adding another field which I titled Yearly_Cost and ran the Calculate Field tool to produce the final cost results in dollars per year; the equation I used for this was Route_Miles * 50 * 2 * 0.022 where '50' represents the number of trips which I then multiplied by '2' to account for the trip round way and the 0.022 represents the dollar amount each truck is charged per mile


Figure 1: Model for calculating closest facility route and cost per county


Results



Figure 2: Sand mine to rail transport routes

Figure 3: Final results chart of county sand mine routes, distances, and cost (per year)



Discussion

Referring to the map in figure 2, we see that many sand mines use the same routes and rail terminals. Some counties may not have very much truck traffic while others have a lot, relatively speaking. The graph in figure 3 helps show this a little better. The "Big 3" counties with high route miles and costs include (from most to least) Chippewa, Barron, and Eau Claire counties, and at the bottom of the list are St. Croix, Burnett, and Winnebago counties.

Looking at both the map and graph, counties (especially the ones greatly impacted) could determine how intensely they should focus on repairing and maintaining their roads, which roads require more attention, and how much they're charging the truck companies.


Conclusion

With the increased level of activity from sand mine trucks causing concerns over road damage to local roads, county governments are attempting to recover the cost of road damage and are implementing a variety of methods to tackle the issue. By analyzing the mine to rail terminal routes, we can calculate a hypothetical cost that counties could charge the truck companies. This could especially benefit the counties that experience high levels of truck activity.


Cited

Hart, Maria V., Teresa Adams, and Andrew Schwartz. Transportation Impacts of Frac Sand Mining in the MAFC Region: Chippewa County Case Study. Mid America Freight Coalition. 2013. Accessed April 10, 2016. http://midamericafreight.org/wp-content/uploads/FracSandWhitePaperDRAFT.pdf.

Friday, April 8, 2016

Data Normalization, Geocoding, and Error Assessment

Goals and Objectives: 

The goal of this exercise was to geocode the locations of assigned sand mines in Wisconsin and then compare our results to the results of our fellow colleagues and the actual mine locations provided by the Wisconsin DNR (WiDNR).

The objectives for this exercise were as follows:

1) Normalize the mine data in the Excel table provided to us
2) Sign in to ArcGIS Online, connect to the geocoding service, and geocode the mines
3) Connect to the our department ArcGIS server and add the Public Land Survey System (PLSS) feature classes (townships, sections, quarter sections, quarter-quarter sections)
4) Using the PLSS feature classes, manually locate mines with PLSS locations
5) Compare my results with the results of my colleagues and the WiDNR


Methods: 

The WiDNR provided sand mine data in the form of an excel table. My colleagues and I were assigned about sixteen mines each that we were to geocode. In order to use the data to geocode the mines, the data table needed to be normalized. After normalization, there were two different methods used to geocode the sand mines based on the information that was provided by the WiDNR. If an actual street address was provided, then the address could be geocoded using the geocoding service made available by ArcGIS. If only a Public Land Survey System (PLSS) address was given, then the PLSS feature class data needed to be added to ArcMap and the address had to be searched for manually and geocoded in that way.

1) Normalizing the data
  • The first task that needed to be done was to normalize the data table. Basic normalization rules include fields needing to be properly formatted and each column should contain only text or or numbers. The 'Address' field contained either a street address, a PLSS address, or both. I took this field and separated it out into individual fields that included 'PLSS', ' Street Address' (contained full street address and is the field I used for geocoding later on), 'Direction', 'Street Number', 'Street Name', 'Street Type', 'City/Town', 'County', 'State', and 'Zip Code'. After the table was normalized, it was ready for geocoding. 
2) Geocoding using street address
  • First, in order to access the geocoding service, I had to log onto ArcGIS Online and then turn on the geocoding toolbar in ArcMap. Using the World Geocode Service, I selected my normalized sand mine data table as the address table which brought the data from the table in as a shapefile and geocoded the addresses from the 'Address' field. From here, I opened the interactive rematch inspector window and checked the status and match type of each address. To confirm the matches, I added an imagery basemap in ArcMap that my geocoded sand mine shapefile would show up on, and I used Google Maps to double check the addresses and made changes to the addresses entered into the rematch window where necessary. For tied or unmatched addresses, I zoomed to each candidate and chose the address myself by using the 'Pick Address' option. 
3) Geocoding the PLSS addresses
  • Mines with only PLSS addresses had to be located and geocoded manually. In order to accomplish this, I had to first connect to the UWEC geography department server so I could access the PLSS data found in the WiDNR2014 database. I then added the plss_townships, plss_sections, plss_quarter_sections, and plss_qq_sects shapefiles. Using the PLSS addresses in the data table and the PLSS data layers in ArcMap, I narrowed down each address to each respective quarter-quarter section and searched within that area for anything that resembled a sand mine and then picked an address point closest to it. 
4) Comparing the geocoded mines
  • After geocoding my mines and exporting the data to the share folder, I then accessed my colleagues' geocoded mines (the ones that I shared mines with) to compare our results. I also compared my results to the actual mine locations provided by the WiDNR. For each of my colleagues' sand mines feature class, I queried each attribute table to select only the mines that matched up with my own mines, then created a layer from the selection and added this data to ArcMap. From here, I used the 'Near' tool to calculate the distances from my mines to the other geocoded mines. For the actual mine locations, since there was not an attribute table associated with the feature class, I had to use the 'Measure' tool to measure the distances between the mines. As I went along, I entered the distance differences into an excel table for comparison. 

Results:

Figure 1 (see below): WiDNR sand mine data before normalization
Figure 1
Figure 2 (see below): WiDNR sand mine data after normalization
Figure 2
Figure 3 (see below): map of my geocoded sand mines in Wisconsin
Figure 3
Figure 4 (see below): error/comparison table of my geocoded mines, my colleagues' geocoded mines, and the actual mine locations
Figure 4
Discussion:

Errors:

In this exercise, the two types of errors that appear are gross and random errors where gross errors are easily detected mistakes while random errors are caused by the inability for both instruments and the human operator to make perfect measurements. These errors could have occurred for a number of reasons including:
  • The data is difficult to work with (sand mine table not normalized; some sites only had PLSS addresses)
  • The addresses did not properly geocode to the right location
  • Manual geocoding of PLSS mines were inaccurate (process of locating addresses using PLSS data in ArcMap led to errors; image analysis in ArcMap, Google Maps, and any other mapping platform used to locate a sand mine visually led to errors)

Types of Possible Errors:
  • Image Analysis (Operational): This applies mostly to locating the PLSS addresses manually where the operator first had to locate the correct quarter-quarter section in ArcMap which, in itself, could have caused errors. If the correct section was found, it was up to the operator to locate the correct mine based on visual features alone within that area. This leads into the next possible source of error.
  • Aging of Maps (Inherent): Mine site are being established and closed and reclaimed constantly. If maps are not being frequently updated, it's possible that when trying to locate a sand mine site visually, the operator may not find it because according to what is on the map, it is not there. 
  • Attribute Data Input (Inherent and Operational): The data provided by the DNR could have been inaccurate in some places and it was also not normalized. Mistakes could have been made by the operator when the table was then normalized. 
Since the WiDNR provided coordinates for mines, we know where the actual correct locations are, and we can also compare our results with that of our colleagues.

Conclusion: 

Referring to the error table (figure 4), the average total error distance totaled 15.56 miles, which seems somewhat high for this exercise, but this is also due to there being a few outliers that pushed this average up much higher (249, 120.9, 95.2). This proves just how easy it can be to make mistakes when working geographic data and shows the importance in producing accurate and precise data. Data produced by anyone (including yourself) should not be taken at face value and should instead be checked and rechecked and compared with other data in order to reduce as many errors as possible.