Processing Terrestrial LiDAR with PDAL

DATE: 2020-12-15

AUTHOR: John L. Godlee

I used a Leica HDS6100 terrestrial laser scanner for my PhD work. The scans are processed initially in a program produced by Leica called Cyclone[1]. I used Cyclone to georeference and stitch together scans from a single plot to create a single shadowless scene. From there they can be exported to .ptx files, each of which can contain multiple scans, each with their own local coordinate reference system. Below is an annotated .ptx file, showing the role of each line:

1: https://leica-geosystems.com/en-gb/products/laser-scanners/software/leica-cyclone

20224  # Number of columns
8615  # Number of rows
482595.121831 8330769.987967 1254.138086  # Scanner registered position in real space (xyz)
-0.990870 -0.134818 -0.000312  # Scanner registered axis 'X'
0.134818 -0.990870 -0.000175  # Scanner registered axis 'Y'
-0.000285 -0.000215 1.000000  # Scanner registered axis 'Z'
-0.990870 -0.134818 -0.000312 0  # 4x4 tranformation matrix
0.134818 -0.990870 -0.000175 0 
-0.000285 -0.000215 1.000000 0
482595.121831 8330769.987967 1254.138086 1
0 0 0 0.500000  # Start of point coordinates
0 0 0 0.500000  # Unreturned pulses
0 0 0 0.500000  
0 0 0 0.500000
-0.000046 0.909775 -1.885635 0.010376  # First returned pulse
-0.000046 0.903366 -1.870834 0.015015
-0.000046 0.895859 -1.853836 0.019165
-0.000046 0.894424 -1.849380 0.020874
-0.000046 0.898849 -1.857010 0.024781

When a new scan starts the same header material will be repeated, but with different values depending on the scanner position.

My end goal was to have a voxelised point cloud with noise removed so that only foliage material remains.

First I needed to split the .ptx file into separate scans, based on the header material in each scan:

#!/usr/bin/env sh

if [ $# -ne 1 ]; then
    printf "Must supply one argument:\n  [1] input.ptx\n"
    exit 1
fi

# Get lines at which to split 
lines=$(rg -n --no-encoding -M 10 "^[0-9]+\s+?$" $1 | 
    sed 's/:.*//g' | 
    awk 'NR%2!=0' | 
    tr '\n' ' ' | 
    sed 's/^[0-9]\s//g')

# Get name of file without extension
noext="${1%.ptx}"

# Split file by scans using array dimension rows in header as line ref
csplit -f $noext -b "_%d.ptx" $1 $lines

Then I converted each scan into a .laz file, with the coordinates transformed according to the transformation matrix in the header material:

#!/usr/bin/env sh

if [ $# -lt 2 ]; then
    printf "Must supply at least two arguments:\n  [1] input.ptx\n  [2] output.laz\n"
    exit 1
fi

# For each argument
matrix=$(head -n 10 $1 | tail -4 | sed -r 's/0\s+?$/0.0/g' | awk -f transpose.awk)

pdal pipeline pipelines/ptx_laz.json --readers.text.filename=$1 \
    --filters.transformation.matrix="${matrix}" \
    --writers.las.filename=$2

Here is the PDAL pipeline:

[
    {
    	"type" : "readers.text",
    	"filename" : "input.txt",
        "header" : "X Y Z I",
        "skip" : 10
    },
    {
        "type" : "filters.transformation",
        "matrix" : "0 -1  0  1  1  0  0  2  0  0  1  3  0  0  0  1"
    },
    {
        "type" : "writers.las",
        "compression" : "true",
        "minor_version" : "2",
        "dataformat_id" : "0",
        "forward" : "all",
        "filename" : "output.laz"
    }
]

Then I merged the .laz files back together using pdal merge file1.ptx file2.ptx ....

Then I voxelised the .laz file, again using a PDAL pipeline. This method creates voxels of 0.01 m^3 (1 cm^3), with a point at the center of each occupied voxel:

[
    {
    	"type" : "readers.las",
    	"filename" : "input.laz"
    },
    {
        "type":"filters.voxeldownsize",
    	"cell" : 0.01,
    	"mode" : "center"
    },
    {
        "type" : "writers.las",
        "compression" : "true",
        "minor_version" : "2",
        "dataformat_id" : "0",
        "forward" : "all",
        "filename" : "output.laz"
    }
]

Finally I excluded noise with a PDAL pipeline. This method measures the mean distance of each point to its eight nearest neighbours, then excludes points with mean distances greater than the 95% confidence interval of the distribution:

[
    {
    	"type" : "readers.las",
    	"filename" : "input.laz"
    },
    {
        "type" : "filters.outlier",
        "method" : "statistical",
        "mean_k" : 8,
        "multiplier" : 1.96
    },
    {
      "type" : "filters.range",
      "limits" : "Classification![7:7]"
    },
    {
        "type" : "writers.las",
        "compression" : "true",
        "minor_version" : "2",
        "dataformat_id" : "0",
        "forward" : "all",
        "filename" : "output.laz"
    }
]