Raymond J. Hintz Associate Professor
University of Maine
Jerry L. Wahl
Bureau of Land Management
Eastern States Springfield, VA
Bureau of Land Management
Division of Cadastral Survey
While the concept of measurement-based cadastral systems in support of land information has been presented by a variety of authors, little literature has been published which details a truly operational or production system. Geographic Measurement Management (GMM) is a measurement-based approach which has been the computational solution to the Bureau of Land Management's Geographic Coordinate Data Base for more than three years.
It will be shown how updated bearings and distances from a dependent resurvey, or coordinates from a control survey source, are utilized with appropriate error estimates in updating the measurement data base.
It will then be demonstrated how batch processes are established to perform a least squares analysis of a region and to generate new coordinates and coordinate reliabilities.
Finally, similar batch processes will be used to re-subdivide all sections in the region to the quarter-quarter section level, followed by processes which validate that the update of the measurement base did not change the previously generated land parcels in the region.
The Bureau of Land Management is responsible for the collection and maintenance of the Geographic Coordinate Data Base (GCDB). GCDB is a collection of all survey records in the U.S. Public Land System, together with the error estimates placed on these records, the coordinates produced from them and the error estimates of those coordinates, the subdivision to the quarter-quarter section level using the rules of the Manual of Surveying Instructions, 1973, and the parcels generated by this process.
GCDB measurement type data includes bearings and distances derived from field surveys of land boundaries which are specific to the U.S.P.L.S. (Hintz, et al., 1993). Among these can be township and section lines, subdivision of section, meanders, mineral surveys, grants, claims, and various other forms of non-rectangular (metes-and-bounds) surveys. The other type of "measurement" information used is control: coordinates derived from digitizing of map information (primarily U.S.G.S. quadrangle maps), coordinates of corners from a field survey process, or a bearing-distance tie to a corner from an input control station. The control coordinates and bearing-distance information are synthesized via least squares analysis to produce coordinates of all corners in the measurement information.
The coordinates from this process serve as control in all further proportioning and subdivision to the quarter-quarter section level. The special properties of fractional and other unique PLSS sections are automatically estimated by GMM, and verified by a user (Hintz and Wahl, 1994).
GMM enables new survey information to be entered at any time, and thus hopefully provides better coordinates via a new least squares analysis of the survey information. The steps required in subdivision have been stored and do not change. Thus the re-proportioning and re-subdivision required due to new controlling corners can occur without user interaction.
One of the processes to be considered is the impact of new data on the production of new, hopefully better, coordinates. In addition, the reliabilities of those coordinates should improve as better data is entered into GMM. The ability to re-compute user defined "regions" (multiple townships in size) in a least squares analysis requires the surveyor to make judgments regarding the impact of the new data on coordinates and their quality estimates in a local area (Hintz, et al, 1995b).
The record information has been attributed to varying extent during the data collection and subdivision process. The next step in initial data collection is using these attributes to resolve the legal land descriptions for the created parcels. This process can become quite complex as one has non-rectangular record information such as meanders, mineral claims, tracts, etc. along with the aliquot information.
Geographic Measurement Management (GMM) was initially designed for the data input, coordinate production, and subdivision process for GCDB. Recent additions to GMM have allowed the creation of the parcels and their automatic identification (Hintz, et al, 1995a). Concentration on the improvement of data has also been a recent thrust in GMM development.
The primary measurement data sources in GCDB are survey plat distances and bearings. These values are derived from both original and dependent re-surveys. As in the original surveys, all information is converted to horizontal ground distance and geodetic mean bearing. Ground distances are converted to their ellipsoid equivalents by a variety of procedures which estimate the elevation of those lines. The user has complete freedom ranging from one general project elevation to individual elevations on all corners. Note that the quality of the data does not usually require precise elevations as estimated values do not erode the quality of the distance data.
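The claim that estimated elevations are adequate can be checked with a first-order reduction. The sketch below is not GMM's procedure: the function name, the spherical mean Earth radius, and the use of one mean elevation per line are assumptions made for illustration, and the elevation is treated as a height above the ellipsoid.

```python
MEAN_EARTH_RADIUS_M = 6_371_000.0  # assumed spherical mean radius, not from the paper


def ground_to_ellipsoid(ground_dist_m, mean_elev_m, radius_m=MEAN_EARTH_RADIUS_M):
    """Scale a horizontal ground distance down to the ellipsoid (first-order)."""
    return ground_dist_m * radius_m / (radius_m + mean_elev_m)


# One mile (1609.344 m) reduced with two elevation estimates 100 m apart.
mile_m = 1609.344
at_1500 = ground_to_ellipsoid(mile_m, 1500.0)
at_1600 = ground_to_ellipsoid(mile_m, 1600.0)
elevation_sensitivity = at_1500 - at_1600  # change caused by a 100 m elevation error
```

Under these assumptions a 100 m error in the estimated elevation changes a one-mile line by roughly 2.5 cm, which is small next to the error estimates usually carried by record distances.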
The year of the survey is usually the easiest basis for assigning error estimates. Obviously an 1850 original survey can be expected to have larger error estimates than a 1990 re-survey in the same area. The improvement of field survey equipment and procedures over time has improved the quality of the resulting measurements. Knowledge of the terrain can also play a role in error estimation, as rugged elevation change lends itself to greater error than flat prairie, especially in the pre-EDM era. Finally, knowledge of the surveyor who performed the work is a critical component of effective error estimation, as certain individuals are known in an area for the quality (good or poor) of their survey work.
Groups of measurement data are therefore categorized by, or tagged with, a Source IDentification (SID). A SID describes the data by textual information (name of surveyor, date of survey, and source document) and by error estimates (distance constant error, distance ppm or ratio error, and bearing error). To verify proper SID's have been associated with the measurements it is possible to view the data graphically. Each line represents a distance/bearing measurement, and the color of the line is associated with a particular SID.
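A SID as described above can be pictured as a small record that carries both the textual tags and the three error estimates. The sketch below is illustrative only: the class name, the field names, the sample values, and the combination of the constant and ppm distance errors in quadrature are assumptions, not GMM's storage format or error model.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceID:
    surveyor: str        # textual information
    year: int
    document: str
    dist_const: float    # distance constant error, same units as the distances
    dist_ppm: float      # distance proportional error, parts per million
    bearing_sec: float   # bearing error, arc-seconds

    def distance_sigma(self, distance):
        # Assumed model: constant and proportional parts combined in quadrature.
        return math.hypot(self.dist_const, self.dist_ppm * 1e-6 * distance)

    def bearing_sigma_rad(self):
        return math.radians(self.bearing_sec / 3600.0)


# Hypothetical values chosen only to contrast an old and a recent survey.
original = SourceID("(hypothetical)", 1850, "original plat", 0.5, 1000.0, 600.0)
resurvey = SourceID("(hypothetical)", 1990, "dependent resurvey", 0.02, 20.0, 10.0)
```

Every line tagged with a given SID then inherits its error estimates, so adjusting one SID re-weights the whole group of measurements at once.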
Control coordinates come from field survey or digitizing. The error estimate of the control is again an indicator of its quality. The SID concept was not employed for control coordinates, as the control files already had individualized error estimates attached, which also tended to indicate the source of the information.
When a new survey occurs in one area, the re-survey information is attached to a new SID with proper textual and error estimate information. This data replaces the older measurements on the same lines based on the assumption that newer information is a better representation of what actually exists in the field. Similarly, new surveyed coordinates of corners will replace older ones which were digitized, or from surveys of lesser quality.
The least squares adjustment output (residuals, snoop numbers, root-mean-square errors, standard error of unit weight, etc.) was used both in initial data collection and in maintenance adjustments to indicate to the user whether appropriate error estimates had been assigned to SID's. One of the unique aspects of GMM is that the surveyor is placing error estimates on surveys in which he or she had no involvement, and thus using the adjustment results to "tweak" error estimates seems like a logical approach in the analysis process.
The addition of the new data creates an interesting predicament. The new measurements should improve all coordinates in that general area, and this obviously requires re-processing by least squares. The question is how much effect the new data will have on coordinates, and how fast that effect will diminish as one moves away from the area where the new data exists.
The effect of new data on coordinates is a function of many factors. A larger change in bearing and distance from the previous values will obviously have a larger effect on coordinates. The error estimates on the new data, and how much those error estimates changed from the previous measurements' error estimates, will have a varying effect. The amount of control in that area, and its error estimates, will also have an effect, possibly minimizing coordinate adjustments in the opposite direction of the measurement update area. The survey measurements define a network geometry which will vary, and thus affect the way new measurements will establish coordinate updates.
The conclusion that can be derived is that no one formula or approach will exactly identify the size of the region which requires re-adjustment. Experience with the process obviously helps the surveyor quantify this, but the concept of constantly improving coordinates in a system is new, and thus very little experience can exist in this type of survey analysis. Even with experience, the surveyor needs a way to efficiently create regional adjustments and to analyze whether his or her assumptions regarding the expected results were valid.
GCDB is organized logically into individual townships of information such as corner type, coordinates, coordinate reliabilities, record data and error estimates, control (including digitized) coordinates and error estimates, rules of subdivision of section, and identified land parcels. Organizing by township enables easy indexing, archiving, and retrieval of information.
In the infancy of GCDB data collection it was originally thought that a township would also be the region which defines the limits of data analysis. To ensure a seamless "connection" to an adjacent township, once a township was completed any neighboring townships not yet completed would require fixed coordinates along the common border. This approach illustrated the build-up of systematic error due to the "fixed township boundary" assumption. As more townships were collected and pieced together in a step-wise fashion, the systematic error due to fixed boundaries simply grew and grew. The alternative was adjusting single townships and letting township boundaries adjust. This, of course, produced the undesirable result of adjacent townships with a common boundary which did not have the same coordinates in each township for a particular corner.
Furthermore, fixed boundaries for seamless purposes create optimistic coordinate standard deviations for corners near the boundary. As an example, a quarter corner one-half mile from a fixed boundary will have small coordinate standard deviations because of its proximity to the fixed positions.
The recognition of this systematic error buildup due to fixed township boundaries resulted in the realization that a multi-township least squares analysis (region) was required. In this scenario the common boundaries would not be held fixed. After a suitable region analysis takes place, those coordinates are imported back to the individual township data structure and those townships' sections are then subdivided. The entire process of merging townships, performing the least squares analysis, re-populating the individual townships with coordinates and reliabilities, and subdivision could be performed in a batch process. This process is the same whether it be initial data collection or later updating maintenance operations in which recent surveys are being added.
In initial data collection, analysis of an individual township was usually performed without a fixed boundary so the systematic error in holding it fixed would not adversely affect the analysis. Therefore, common corners on township boundaries did not have the same coordinates. To further complicate the problem, the same corner has a different corner identification (corner number/name) in each township in which it exists (Hintz, et al, 1993). This identification convention allows very automated subdivision processes to occur (Hintz, et al, 1994).
The township merge was finally resolved by not basing it upon point identification, and only using coordinates in a relative sense. At the beginning of the merge process, GCDB personnel are asked to input an appropriate coordinate match tolerance for the region. The merge process (called FORMLSA) reads a region file consisting of township names and their location on the computer's hard disk. The first township is read in and the township name is attached to the original point identification. When a subsequent township is read in, each bearing and distance (record data) is compared absolutely to existing bearings and distances in the building region. If a match is found, coordinate comparisons are made to see if they are within the user defined tolerance. If no match is found, the record data does not yet exist in the building region, and region point identifications are assigned based on the township it is in. If a unique match is found, that line already exists in the building region and those points are identified as belonging to more than one township. More than one match results in an error message to the user that the tolerance level needs to be refined. Lines where only one corner is in the region are handled in a similar fashion, though that line is a new bearing and distance. A bearing-distance closing to a township line is the best example of this kind of line.
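The matching logic just described can be sketched as follows. This is not the FORMLSA code: the function name, the tuple layout of a line, and the dictionary-based data structures are hypothetical, and the sketch assumes a common line is recorded in the same direction in both townships. A line with only one corner already in the region is attached through a corner matched by another line; a corner known to the region only by coordinate proximity is not handled here.

```python
import math


def merge_township(region_lines, region_coords, name, lines, coords, tol):
    """Merge one township into the building region; return local id -> region id.

    region_lines / lines: lists of (from_id, to_id, bearing, distance)
    region_coords / coords: dicts of point id -> (x, y)
    """
    def close(p, q):
        return math.dist(p, q) <= tol

    id_map, new_lines = {}, []
    for a, b, brg, dist in lines:
        # Record data is compared exactly; coordinates only within the tolerance.
        matches = [
            (ra, rb) for ra, rb, rbrg, rdist in region_lines
            if rbrg == brg and rdist == dist
            and close(region_coords[ra], coords[a])
            and close(region_coords[rb], coords[b])
        ]
        if len(matches) > 1:
            raise ValueError(f"ambiguous match for {name}:{a}-{b}; refine the tolerance")
        if matches:
            id_map[a], id_map[b] = matches[0]   # line already exists in the region
        else:
            new_lines.append((a, b, brg, dist))
    for a, b, brg, dist in new_lines:
        for p in (a, b):
            if p not in id_map:                  # point not yet in the region
                id_map[p] = f"{name}:{p}"
                region_coords[id_map[p]] = coords[p]
        region_lines.append((id_map[a], id_map[b], brg, dist))
    return id_map


# Two hypothetical townships sharing one boundary line (bearing 90, distance 100).
region_lines, region_coords = [], {}
m1 = merge_township(region_lines, region_coords, "T1",
                    [(1, 2, 90.0, 100.0), (2, 3, 180.0, 100.0)],
                    {1: (0.0, 0.0), 2: (100.0, 0.0), 3: (100.0, -100.0)}, tol=1.0)
m2 = merge_township(region_lines, region_coords, "T2",
                    [(10, 20, 90.0, 100.0), (10, 30, 0.0, 100.0)],
                    {10: (0.3, 0.2), 20: (100.1, -0.2), 30: (0.0, 100.0)}, tol=1.0)
```

In this example corners 10 and 20 of the second township resolve to the region points created from the first township, while corner 30 receives a new region identification.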
This approach insists that the same bearing and distance for a common line exist in all townships. This is actually a requirement of GCDB, and GMM has efficient ways to ensure this happens, so it fits nicely into the region concept. In the maintenance update operation, a previous regional analysis has been performed, and thus a corner common to multiple townships will have the same coordinates in all townships. In some ways it appears this eliminates the region-forming logic required in initial data collection. In reality, however, new data is usually re-processed at the township level first to ensure blunders do not exist when a region is processed. That particular township's exterior will no longer match its neighbors until the region is re-processed.
A series of townships is thus merged into a region data set by a simple ASCII listing of the township file names that are included in the region. Once the match tolerance is input, the region formation, the least squares analysis of the region, the parsing of adjusted coordinates back to individual township files, and the re-subdividing of all region townships according to previously user defined rules can all run as a batch process.
Unless GCDB personnel have the computing power to readjust an entire state every time new data is added, the problem of matching regions still exists, just as the problem of matching townships used to exist. It is logical that a region match will have fairly insignificant problems, as the amount of data being analyzed reduces the problem in magnitude. To enhance the region process the buffer concept can be utilized in two ways. Note this process is for coordinate generation only and not reliability (final coordinate standard deviation) generation. The first procedure is simply where the user is confident that creating a fixed boundary will not cause adverse problems. In forming a region one is allowed to add what are called buffer townships. These buffer townships' measurement information is not made part of the region analysis. Instead common bearing-distance information is identified as in phase 1, but only the existing adjusted coordinates from the buffer township are brought in as fixed control. Buffer townships are obviously not imported back to their individual township coordinate files, as they were part of a previous region adjustment whose coordinates one does not wish to change.
The second procedure is an alternative to the first process where the user is concerned about the effects of the fixed coordinates. To minimize the effect of fixed coordinates this second procedure can be thought of as a double buffer concept. The first buffer consists of townships which are brought in for least squares analysis, but their coordinates will not be imported back into their respective townships as they were part of a previous region analysis. The second buffer is the fixed layer of coordinates as identified in the first procedure. The double buffer still creates a seamless GCDB as coordinates of boundary points to the first buffer retain their coordinates as they previously existed, and not from this regional analysis. The assumption is that these boundary coordinates are shifting insignificantly due to the double buffer, and statistically (from the reliability values) it does not matter which coordinates are assigned to the point. The double buffer concept requires regions to have overlap, and the overlap provides a measurement buffer between data being used for coordinate production, and an outer buffer being held fixed which ensures regions are seamless.
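One way to picture the double buffer is as a role assigned to each township before the adjustment is run. The sketch below is an assumption-laden illustration: townships are indexed as (row, column) cells on a regular grid, each buffer is taken to be exactly one ring of townships, and the function and role names are hypothetical.

```python
def assign_roles(core, available):
    """Classify townships for a double-buffer region adjustment.

    core: set of (row, col) townships whose coordinates will be updated
    available: set of all (row, col) townships that have data
    """
    def ring(inner):
        # Townships touching `inner` (including diagonally) that are not in it.
        out = set()
        for r, c in inner:
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    out.add((r + dr, c + dc))
        return (out - inner) & available

    first = ring(core)            # adjusted, but coordinates are not imported back
    second = ring(core | first)   # existing coordinates held as fixed control
    roles = {t: "adjust_and_import" for t in core}
    roles.update({t: "adjust_only" for t in first})
    roles.update({t: "fixed_control" for t in second})
    return roles


# A 5 x 5 block of townships with the center township being updated.
grid = {(r, c) for r in range(5) for c in range(5)}
roles = assign_roles({(2, 2)}, grid)
```

Here the center township is surrounded by a measurement buffer of 8 townships and a fixed layer of 16, which shows why neighboring regions defined this way must overlap.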
The size of a region is controlled by the surveyor. It is derived by analyzing the available computing resources (speed, memory, and hard disk size), how long one is willing to wait for an adjustment to complete, and how much systematic error due to fixed boundaries one is willing to tolerate. Rapid improvements in computer hardware make larger region analyses less time consuming. The number of townships in a region is also controlled strongly by the number of points in the individual townships. A township with only township and section lines contributes fewer unknowns (stations or corners) to the region analysis than a township which contains meanders, mineral claims, and other types of special surveys. Since the region adjustment is a batch process, beginning a process (or sequential multiple processes) before leaving work in the evening (or for the weekend) is a very typical procedure. The computer does not need an operator to intervene in the task, so on the next work day the surveyor returns to a completed region adjustment.
The term "kerplunking" has been decided upon as the best one word description of the developed regional approach for generating coordinate standard deviations, which in GCDB terms is called a reliability. Kerplunking is an offshoot of concepts in survey network design. In this design process one defines approximate coordinates for control and unknown stations, and defines what measurements are intended to be made. Included are the error estimates for those measurements. From this information the variance-covariance matrix which directly produces coordinate standard deviations/reliabilities can be derived. The actual values of measured quantities are not required, and the solution is not iterative as final coordinates are not being solved for. This means reliabilities can be generated as a separate process from the least squares analysis for coordinate production.
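In standard least squares notation (the symbols here are assumed for illustration and are not taken from GMM), the design computation described above can be written as follows, where A is the design matrix evaluated at the approximate coordinates, W is the weight matrix built from the measurement error estimates (taken here as uncorrelated), and the a priori variance of unit weight is normally set to one:

```latex
\Sigma_{\hat{x}\hat{x}} = \sigma_0^2 \left( A^{\mathsf{T}} W A \right)^{-1},
\qquad
W = \operatorname{diag}\!\left( \sigma_1^{-2}, \dots, \sigma_m^{-2} \right),
\qquad
\sigma_{x_j} = \sqrt{ \left( \Sigma_{\hat{x}\hat{x}} \right)_{jj} }
```

The observed values themselves appear nowhere in these expressions, which is why reliabilities can be generated without iterating and separately from the coordinate adjustment.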
The next assumption in kerplunking is that measurements not in close proximity to a station do not affect the reliability results of that station. In simple terms, making some measurements five miles from a station, and very indirectly connected to that station, does not appreciably improve the reliability of that station's coordinates. In generating reliabilities one also does not want any artificially fixed coordinates, as these will produce artificially small reliabilities in that general area. Kerplunking thus forms a region measurement system similar to region coordinate production, but without the fixed buffer concept. What is dramatically different is that the entire region is not analyzed for reliability generation. Instead a user defines a distance tolerance around an individual township which is large enough that the user feels measurement data outside of that tolerance will not affect the generated reliabilities of that township.
The best example is a six mile tolerance. For a particular township, the eight surrounding townships will be used in addition to it in generating its reliabilities, for nine townships in all. Only that middle township's reliabilities get updated. The algorithm then moves to the next township and performs the same task. One distance tolerance is used for an entire region reliability analysis, and thus the reliability generation process is again non-interactive. The test that determines whether a large enough tolerance has been used is the reliabilities of corners common to two or more townships. Their values of reliability in each township should show no statistical difference.
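The township-by-township loop can be sketched as a schedule of (township to update, townships contributing measurements) pairs. The grid indexing, the function names, and the rounding of the distance tolerance up to whole six-mile townships are assumptions of this sketch; the variance-covariance propagation itself is omitted.

```python
import math

TOWNSHIP_SIDE_MILES = 6.0  # nominal township size


def reliability_block(township, tolerance_miles, available):
    """Townships whose measurements are used for one township's reliabilities."""
    rings = math.ceil(tolerance_miles / TOWNSHIP_SIDE_MILES)
    r0, c0 = township
    return {
        (r, c)
        for r in range(r0 - rings, r0 + rings + 1)
        for c in range(c0 - rings, c0 + rings + 1)
        if (r, c) in available
    }


def kerplunk_schedule(available, tolerance_miles):
    """Yield (township to update, contributing townships), one township at a time."""
    for township in sorted(available):
        yield township, reliability_block(township, tolerance_miles, available)


# A 4 x 4 block of townships processed with a six mile tolerance.
grid = {(r, c) for r in range(4) for c in range(4)}
schedule = dict(kerplunk_schedule(grid, 6.0))
```

An interior township draws on a 3 x 3 block, while a township at the corner of the available data has only itself and three neighbors to draw on.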
The one township tolerance appears very suitable for region reliability generation and thus nine townships becomes the largest variance-covariance propagation problem. Of specific interest is that a region for coordinate generation can differ totally from the region as defined for reliability generation. The region for coordinate generation is limited only by the computing speed of the program/computer, and thus how long one desires to wait for the least squares to process. The reliability generation is performed a township at a time with buffering data, and thus in theory could generate a state's worth of data in one process. Of specific importance is the batch nature of the processing. Jobs can be set into operation during off hours, and completely updated township information will exist when one returns to work.
During initial data collection the procedures for section subdivision have been defined (Hintz, et al, 1994). This enables all townships which have new coordinates from a region analysis to be re-proportioned in a batch process. The next component of computational analysis which exists is the identification and computation of all intersections of parcel lines which have been created by the initial survey lines represented by bearings and distances, the definition of more lines by the section subdivision process, and the "subdividing" of both initial survey lines and subdivision lines into more lines by operations such as placing a 1/16th corner midpoint between its corresponding 1/4 corners. These intersections normally exist at intersection of special surveys (meanders, mineral claims, etc.) with section subdivision lines. These intersections were automatically identified and stored during initial data collection.
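The two geometric operations named above, placing a midpoint corner and intersecting a special survey line with a subdivision line, reduce to short computations when treated in plane coordinates. The sketch below makes that plane assumption; the function names and sample coordinates are hypothetical and are not GMM's routines.

```python
def midpoint(p, q):
    """Midpoint between two corners, e.g. a 1/16th corner between 1/4 corners."""
    return ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)


def segment_intersection(p1, p2, p3, p4, eps=1e-12):
    """Intersection of segments p1-p2 and p3-p4, or None if they do not cross."""
    dx1, dy1 = p2[0] - p1[0], p2[1] - p1[1]
    dx2, dy2 = p4[0] - p3[0], p4[1] - p3[1]
    denom = dx1 * dy2 - dy1 * dx2
    if abs(denom) < eps:
        return None  # parallel or collinear lines are not treated as intersecting
    # Solve p1 + t*(p2 - p1) = p3 + u*(p4 - p3) for t and u.
    t = ((p3[0] - p1[0]) * dy2 - (p3[1] - p1[1]) * dx2) / denom
    u = ((p3[0] - p1[0]) * dy1 - (p3[1] - p1[1]) * dx1) / denom
    if 0.0 <= t <= 1.0 and 0.0 <= u <= 1.0:
        return (p1[0] + t * dx1, p1[1] + t * dy1)
    return None


# A section centerline between two 1/4 corners, crossed by one meander course.
west_quarter, east_quarter = (0.0, 40.0), (80.0, 40.0)
sixteenth = midpoint(west_quarter, east_quarter)
crossing = segment_intersection(west_quarter, east_quarter, (20.0, 0.0), (40.0, 80.0))
missed = segment_intersection(west_quarter, east_quarter, (20.0, 0.0), (25.0, 20.0))
```

Storing which pairs of lines produced an intersection, rather than only the intersection coordinates, is what would allow the same pairs to be recomputed after a re-adjustment.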
The same intersections will occur in the re-processed data except in two instances which are automatically identified. The first case is when the new survey information included new survey lines such as could occur in subdivision beyond the 1/4 1/4 section level, or if any other new special survey type lines were created in the same way. The second situation occurs when the new adjustment based on new (and hopefully better) survey information causes a displacement of lines such that some of the previous intersections are no longer valid and now different line intersections exist. This situation could occur when special survey locations significantly shift due to the new survey bearings and distances. This condition is very unlikely to occur but must be considered in remote situations. The validation of the previous intersections is followed by validation that the same parcels exist in the reprocessed township. New intersections, old invalid intersections, and new parcels due to new intersections, and possibly new lines, create the rare need for user verification that the new parcels have been automatically assigned their proper legal description.
The ability to input new measurements and assign new error estimates in Geographic Measurement Management has been demonstrated. The re-adjustment for new coordinates in a regional least squares analysis is the first component of the measurement based system. Next the re-generation of coordinate reliabilities has been shown to be independent of the coordinate generation. Finally, the regeneration of subdivided parcels has been defined by the initial data collection procedure and can generally occur, like all previous processes, in a totally batch environment.
This work has been partially funded through the cooperative agreement between USDI-BLM Eastern States and the University of Maine. The authors especially thank cooperative assistance representative Corky Rodine of BLM Eastern States Cadastral Survey.