Self Hosted NTP (GPS + PPS)

Hi all,

Around 12 months ago I had a play with some u-blox NEO GNSS modules and had a proof of concept working using a Raspberry Pi and a 7M based module over TTL as a time source for an NTP server. I can’t recall the exact sites I was looking at when researching and messing about with them but most of them were similar to this one. It was a relatively straightforward way to get extremely accurate NTP from the GPS locked + PPS signals and serve the rest of the network at home.

A few days ago I pulled the 8M based module (RS232 NaviSys GM-8013R) out of the parts bin and connected it up so that it was drawing +5V and GND from an x86-64 PC's USB port, and feeding the TX/RX/PPS (DCD) signals back through the same PC's serial port (with no ground connected to the RS232 port). As a proof of concept it worked very well, and in the next day or two I'm intending to grab a few breakout DB9 and USB connectors to wire it up a bit more solidly and with an extended cable so I can place the antenna in a better position. Ultimately I'm intending to put the module in a weatherproof case outside, and run the cable back inside to connect to the USB and DB9 serial connectors of a server which will provide self hosted NTP.

I have very little electronics knowledge and was wondering if I’m going to run into issues if I power the module from an external USB supply or whether I should stick to drawing USB power from the same PC so that both connectors are referencing a common ground. I’m also wondering if I should be adding one or more fuses somewhere (whether or not I power it externally), or tying/not tying the grounds together.

Has anyone gone down the self hosted NTP route, or can provide some insight into the electronics side of what I'm doing? I don't particularly want to have another machine (i.e., a Pi) doing just NTP when I can add it to an existing server, nor do I want to use a USB based GNSS module as their timing accuracy is poor. The USB power + RS232 signal approach seems too good to be true… what have I missed?

Cheers,

Belfry

Update 26/2: I pondered it overnight and will add an optocoupler on the RS232 side, tie the RS232 signal ground to the supply ground, and power the whole lot from a 5V wall wart (with a fuse in line). Hopefully in a week or so, after the goodies arrive and I have everything connected up, I can report back with the NTP accuracy stats. The Pi version last year was excellent, so I'm hoping for similar results with this more permanent implementation!

Kewl project, Belfry, but beyond my capabilities. This sounds more like Terry’s stuff.

I am still stuck on “at the third stroke it will be 14.28.00 precisely”.

“Precisely” has acquired new meaning in the last 40 years.

Pip… pip… pip

I unexpectedly had some free time this afternoon to tidy up my initial thought bubble and think it through a bit more thoroughly.


I’ve got 10m of two core cabling (+5V and GND) and 10m of four core cabling (TX/RX for GPS data, PPS pulse tied to the DCD pin, and GND - currently not connected as it’s the same as the USB ground) running to the unit. The power and signal grounds are tied together at the GNSS module end (a horribly fiddly JST connector), and the power cables and signal cables break off separately so I have the option to power via a separate PSU (probably USB wall wart or repurposed plug pack from something else) and run the signal cables in a different direction to the rest of my homelab gear. The lone white cable to the top is power on/off and comes out to a jumper on its own in case I need it at some stage, but it’s otherwise not used. It’s not connected as the NaviSys datasheet says I can leave it floating, and I have no need to remotely turn the module on or off.

It's currently connected to a [sacrificial] Pentium 4, selected solely in case something went wrong with my wiring and the magic smoke came out! GNSS fix was virtually instantaneous, even indoors. I briefly had it plugged into a Windows 7 machine to run the u-blox software and get some diagnostic data out, which showed it was quite happily seeing both GPS and GLONASS satellites from a poor location indoors. The software side was very simple; here's a very rough sequence to retrace what I did, rather than a full and comprehensive step by step instruction set:

  • install gpsd gpsd-clients pps-tools chrony
  • cat /dev/ttyS0 (to check I was getting GPS NMEA data)
  • edit /etc/default/gpsd to add DEVICES="/dev/ttyS0" and GPSD_OPTIONS="-n" (so gpsd starts immediately rather than waiting for a client to connect)
  • restart gpsd service
  • cgps (to check GPS data was coming in via gpsd and that the GPS had a fix)
  • ppscheck /dev/ttyS0 (to see if the PPS data was being sent/detected by gpsd)
  • edit /etc/chrony/chrony.conf to add "refclock SHM 0 refid GPS" and "refclock SOCK /var/run/chrony.ttyS0.sock refid PPS" lines (add GPS and PPS sources to chrony)
  • restart chrony service
  • restart gpsd service
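For anyone retracing this later, the end state of those two config edits looks roughly like the below. Treat it as a sketch: the offset value is a placeholder you would tune for your own hardware, and the SOCK path follows gpsd's convention of creating /var/run/chrony.<device>.sock for chrony's use.

```
# /etc/default/gpsd
DEVICES="/dev/ttyS0"
GPSD_OPTIONS="-n"

# /etc/chrony/chrony.conf (added lines)
refclock SHM 0 refid GPS offset 0.0
refclock SOCK /var/run/chrony.ttyS0.sock refid PPS
```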


It looks as if I may need to fiddle with the GPS offset to bring it back into tolerance, but the PPS reference is incredibly accurate straight out of the gate. It's interesting to see the difference between the random public NTP servers chronyd is referencing vs. the GPS vs. the PPS sources (note the units of measurement: milliseconds vs. microseconds vs. nanoseconds).

Will refine the hardware side of things over the coming weeks as I have time to come back to it and after my optocoupler arrives. Straight out of the gate it's 0.000000002 seconds slow; being down to nanosecond level measurements is pretty accurate in my mind!

Pip… pip… pip

:exploding_head:

What you’re doing is very impressive and way beyond me too @Belfry .

For perspective, I recently set up a Home Assistant Glow to log household power consumption. To supply it with power I bought a USB power bank from Myer, and a solar blanket from BCF.

So, suuuper consumer end of the spectrum over here.

Could you run NTP in a container on your home network? How inaccurate would that be? As a protocol, it's perpetually correcting its time in relation to its upstream authority, isn't it?

Unrelated - but whoa, thank you for the top tip on the Glow. At a glance that unit may solve a very real problem I have - I currently have a slightly janky method of pulling data from the solar inverter and its integrated energy meter to feed into the HA energy dashboard. Solar data works great, but the consumption data never has worked reliably via any interface I’ve tried. This data is the final piece of the puzzle to use the “excess” solar to turn heating/cooling and other appliances on/off based on presence, environmental variables, and energy use, so I will definitely go down this rabbit hole, thank you!

I'm still learning about all of this, but my limited understanding is that at this sort of level even running it in a VM can cause timing problems. Additionally, the general consensus is to avoid USB because the latency and jitter introduced by USB polling are enough to make things less accurate than just using a public NTP server.

I'm doing a new build (well, new to me: second hand parts) of a 1U Proxmox server to replace a few odds and ends like Pis and other low end computers running services such as Home Assistant and SearXNG. I really wanted to stick with RS232 and PPS specifically as that's the "gold standard". I'm hoping that the new Proxmox server will be able to reliably run a basic high priority VM with nothing but gpsd + chrony on it and the RS232 hardware passed through from Proxmox for the PPS signal. I'm still waiting for the RAM for the new machine (and waiting for the optocoupler so I don't accidentally fry the new machine), and then I'll replicate the same setup with RS232 passthrough and maxed out CPU units (priority) in Proxmox. If that doesn't work well, I might use one of the newly retired smaller machines for my first foray into an RTOS and play with the time stuff that way.

As an aside, after my post I set DHCP option 42 in the router and can see chrony’s status listing all the clients on my LAN using the new locally hosted PPS NTP time source, which is quite neat.

Cheers,

Belfry

I have a decades old external GPS sensor which has a magnetic base and USB. It was made for use in automobiles where it sticks on the magnetic roofs.

I used to use it in a Robinson R22 helicopter with a laptop to plot WiFi apps on a Google Map from the air flying around Lismore to Ballina with one of my 12dB panel antennas screwed to the base of the craft.

This way I could see which house had a WiFi AP and which didn’t.

Ahh the old WiFi war flying days, the stories I have!

Anyway … while following your thread I had my misgivings about the superiority of serial comms vs USB as I know both fairly well having engineered them in my embedded designs.

Because this is the modern age where we have running water, fire and AI everywhere I’ve used one of my tools to provide a more nuanced comment :wink:

tp@ubuntu:~$ ollama run --verbose qwen2.5-coder:32b

>>> does USB polling the external GPS sensor make a NTP server less accurate than using serial comms ?

The accuracy of an NTP (Network Time Protocol) server can indeed be influenced by how it receives time data from an external GPS sensor, but whether USB polling is inherently less accurate than using serial communications depends on several factors.

### Factors to Consider:

1. **Latency:**
   - **USB Polling:** When a GPS sensor is polled over USB, there might be some latency introduced due to the polling interval and the time taken for data transfer from the USB device to the host system.
   - **Serial Communications:** Serial communication can often have less overhead compared to USB in terms of setup and data transfer. However, this also depends on the specific implementation and baud rate used.

2. **Polling Interval:**
   - If the polling interval is too long (e.g., every second), it could introduce a noticeable delay in updating the time from the GPS sensor.
   - Serial communication can be configured to send data continuously or at a very short polling interval, which might reduce this latency.

3. **Data Transfer Overhead:**
   - USB can handle high data transfer rates and is generally robust. However, it requires drivers and possibly additional system resources for handling the USB protocol.
   - Serial communications involve less overhead but are typically slower than USB unless using a very high baud rate.

4. **System Load and Real-Time Processing:**
   - The overall performance of the NTP server can be affected by how well the host system handles time-critical tasks. If the system is heavily loaded, handling USB communication might introduce more delays compared to serial communication.

5. **Protocol Overhead:**
   - Both methods involve some protocol overhead for data encapsulation and transmission. The specifics of this overhead depend on the protocols used (e.g., NMEA over serial vs. a custom or existing protocol over USB).

### Practical Considerations:

- **Stability:** Serial communication might be simpler in some cases, which can lead to more stable performance.
- **Ease of Implementation:** USB is often easier to set up and manage, especially on modern systems with good driver support.

### Conclusion:
In general, using serial communications could potentially offer lower latency and overhead compared to USB polling, but the actual impact on NTP accuracy would depend on the specific setup and configuration. If the USB polling interval is short enough and the system can handle the USB data efficiently, there might not be a significant difference in accuracy between the two methods.

If high precision is critical, it's often best to benchmark both setups in your specific environment to determine which method provides the most accurate timekeeping for your NTP server.

I concur!

Cheers,
Terry

My optocoupler arrived this morning, but unfortunately it didn’t pass through all the pins, and one of the ones it didn’t support was the PPS pin I really needed!

I bit the bullet and hooked the cabling up to the new Proxmox server in the same configuration as the test P4 machine. The Proxmox server is running a single VM (Debian 12) with the serial port passed through and a dedicated NIC passed through via IOMMU (Using this guide).

After a ton of fiddling with configs on both the host and VM, I can't get close to the same outcome via a VM as I had on my test setup. We're talking microseconds (millionths of a second) via a VM versus nanoseconds (billionths of a second) on the bare metal P4. It's still very good - compare with milliseconds via public NTP (thousandths of a second) - but this does demonstrate just how sensitive to latency things are when down this far in the weeds!

I’ll continue to have a play and possibly move it to an LXC container later today to see how the latency compares with a full VM. I suspect that I’m going to have to move the GNSS sensor back to bare metal hardware if I want to keep that nanosecond level of precision.

That’s amazing! I was quite impressed about 20 years ago when I saw someone “warstanding” with a 24dBi grid pack antenna on top of a hill and panning it across the city. You definitely win with “warflying” :joy:

Without any particular knowledge of this topic I am not surprised that the bare metal beats a virtual machine. Emulation will have a cost.

I would be interested to hear how your container goes but I suspect it cannot beat spinning metal or probably better yet SSD.

I could get time synchronised within approximately ±8000ns using an LXC container, compared to tens to hundreds of microseconds on a VM. For the sake of comparison I installed gpsd/pps etc. directly within the Proxmox shell and configured chrony to use those sources, and had similar results to the bare metal P4 setup yesterday. The overhead being introduced by LXC or running in a VM was too much to overcome, despite me giving it the best chance of success (setserial low_latency to prioritise interrupts, hardware passthroughs, maxing out cpu governors etc.). Hardware for all tests was a Xeon E3-1220v2, 8GB RAM, 4x 256GB SSDs in a RAID10 ZFS pool.

Interesting results and fun to have a play with for a few hours, but I’ll have to ponder the way forward from here. Perhaps it’ll all go back into the parts bin, perhaps I’ll repurpose something else (e.g., an older embedded system) to do dedicated NTP, or perhaps I’ll tinker some more and try to reduce the latency but I suspect there’s not much more that can be done to reduce millionths of a second down to billionths of a second!

It sounds like you have exhausted all the possibilities, Belfry, but I did wonder whether the Linux kernel you were running under Debian was 6.12 or later, which was when PREEMPT_RT was merged to make Linux an RTOS. You do still need to install linux-image-rt-amd64 to get there, however.

Great suggestion! I know Proxmox can be really fussy about kernels, ships its own version of the kernel, and explicitly tells you not to use the Debian ones. I'll see if I can shoehorn an RT kernel newer than 6.12 onto there (and upset Proxmox) to give it a go before I reinstall the machine and put it into production.

Edit: Looks like that’s a bit of a non-starter unless I go down the route of compiling it myself. Consensus on the Proxmox forums is not to use anything but the pve kernels, that among the issues I’ll have is that it will break ZFS (so I’m going to have to reinstall the machine anyway), and the latest Proxmox kernel in testing as of today is 6.11. I haven’t got the time to sink into compiling a kernel now but if an opt in package for 6.12 pops up in the next few weeks before I have to set the machine up properly, I’ll give it a go. Thanks for the idea!

Circling back to this one with an update. About a week ago I finally got the new (second hand) Proxmox server properly set up in the rack at home and started gradually virtualising a few Raspberry Pi and "Pi like" devices onto the machine. One of my goals with this server was to optimise for low power (as in electricity, not MHz) and for low temperature, as it gets quite toasty in that part of the house during summer. As part of this I've swapped the four core E3-1220v2 (base 3.1GHz, 69W TDP) for a two core E3-1220Lv2 (2.3GHz, 17W TDP) and pegged it to the powersave cpu_governor.

I had some issues after moving an ADS-B receiver over to a VM a few days ago due to MLAT being out of sync. No doubt the VM latency and the host’s lower power optimisations have partially contributed to this. As multilateration is quite timing sensitive, it sent me back down the documentation + forum reading rabbit hole of trying to optimise latency sensitive workloads on Proxmox (e.g., IOMMU, passing through devices vs. passing through entire controllers, prioritising workloads with cpuunits, “Use tablet for pointer = no”). I didn’t end up mucking around with the kernel in the end and am using the default kernel that comes with Proxmox VE 8.4.1 (6.8.12-9-pve). Of course, once I got MLAT on the ADS-B feeder working and stable, the next logical step was to get the GPS+PPS module rigged up again and see how it performed.

On top of the above optimisations, I used setserial low_latency on the Proxmox host and my time server VM (Debian 12 with nothing but gpsd, chronyd, and setserial added), lowered the RS232 baud rate to 9600, gave it 5000 cpuunits, and set the affinity to a single core. Highly likely that all of those things were unnecessary, but I’m happy to report that the timing has stabilised at <1µs based on the PPS source and is often around ±100ns. It’s not quite “bare metal” accuracy, but far more than I would ever need for home! The wiring definitely needs a clean up and I’ll have to mount the module into a box as well as solder proper connectors on to the RS232 and USB cables. The module quite happily got a GPS fix inside the garage underneath a two story house so I’ll be mounting it inside somewhere rather than worrying about weatherproofing.

I’ll keep an eye on it for a few weeks while I’m setting up the rest of the VMs, and if it’s stable will roll it out as a DHCP option for local NTP across my network. Once the last of the VMs are set up and the wiring is tidied up, I might experiment with the cpu affinity etc. to try and determine which of those variables has the biggest impact and whether they were all necessary after all.

Cheers,

Belfry

Sounds very technical, Belfry. Is the end point flight tracking?

Playing around with the GNSS module for precision timing wasn’t really related - although having a good local time source certainly won’t hurt!

Flight tracking / feeding (pushing data to FlightRadar24, FlightAware, AirNav Radar, etc.) is actually quite simple overall and one could get started with nothing more than a cheap TV tuner. I got into it seriously a little over five years ago and it’s now become more of a jumping off point into other projects and experimentation. For example, testing low powered/embedded systems (and having a known workload to throw onto the machines), setting up multiple geographically diverse sites (and networking them together with tunneling/routing/firewalling considerations that go with that), trying out different gain settings/antennas/cables/filters (and seeing how the reception range compares). I find it easier to learn about some new tech if there’s a specific goal I’m trying to achieve at the end of it.

Getting quite off topic from the time stuff, but I am happy to discuss it further in another thread or at a meeting if there’s interest. It really is simple to get started though, and FlightAware have some great instructions and an Australia centric shopping list here.

A bit of a cheeky thread bump and a project update for the group.

Since April, that GNSS module has been dangling precariously near a window in the garage, hanging by its tiny little JST connector, with a long run of cable taking the RS232 signal via a hole drilled in the plasterboard and into the serial port of the server discussed earlier in the thread. It's still running on the same Debian VM with the same optimisations I was playing with back in April.

It’s fine from an NTP standpoint, but a bit rubbish from an overall implementation one. I was never 100% happy with the VM based setup and it was something I intended to come back to at some stage.

I recently came across the fantastic looking GNSSTimeServer project on GitHub. Around the same time I saw the Waveshare ESP32-S3-ETH pop up… and one of the example implementations of the GNSSTimeServer uses the same ESP32 + W5500 Ethernet combination, so I bought the ESP32-S3-ETH board and PoE module.

I had intended to (and spent a lot of time attempting to) get the project running on the ESP32-S3-ETH board, but ultimately didn’t understand a lot of the mechanics well enough to get it going. It was my first foray into PlatformIO, my first project using a bare ESP32 dev board (rather than starting from something really well packaged up by the community, such as Meshtastic devices), and even my first look at C++. I couldn’t seem to get the W5500 to respond, despite remapping pins, switching between ETH.h, Ethernet.h, Ethernet2.h, and so on. I’m sure someone who actually knew what they were doing could get it running, but I was unsuccessful. That project truly looks awesome and was the main inspiration for me to start tinkering again. It’s far more feature rich than I need as well, as I only need “insert cable, receive NTP”. So my inability to get the GNSSTimeServer project working on that board was a great opportunity to try and get something working by myself!


Over the past few weeks, I’ve started with a sketch in the Arduino IDE, and have gradually been iterating from:

through to DHCP working and having some basic IPv4 connectivity

through to “it keeps time” and a basic HTTP server

through to a bad but functional NTP server (for anyone playing at home, yes, that’s 20438 days out of whack. The NTP server was deliberately dishing up Stratum 15 replies of Unix Epoch during my proof of concept).
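For anyone wondering where a number like that comes from: NTP timestamps count seconds from 1900-01-01 rather than the Unix epoch of 1970-01-01, a fixed offset of 2,208,988,800 seconds, with the sub-second part carried in a 32-bit fraction. A quick illustrative sketch in Python (just the arithmetic, not my actual ESP32 code):

```python
# NTP timestamps count seconds since 1900-01-01 (era 0);
# Unix time counts seconds since 1970-01-01.
# The gap is 70 years including 17 leap days.
NTP_UNIX_OFFSET = 2_208_988_800  # (70*365 + 17) * 86400

def unix_to_ntp(unix_seconds: float) -> tuple:
    """Convert a Unix timestamp to an NTP (seconds, 32-bit fraction) pair."""
    secs = int(unix_seconds) + NTP_UNIX_OFFSET
    frac = int((unix_seconds % 1) * (1 << 32))
    return secs, frac

# Serving the Unix epoch as "now" in late 2025 therefore answers
# roughly 20,000+ days in the past - the deliberately wrong
# Stratum 15 reply described above.
```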


Lots of opportunities to learn along the way, including, but not limited to:

  • while (!Serial) { ; } as part of initialising the serial bus for debugging… which meant that every time I had it plugged in to debug it was working, but as soon as I had serial unplugged it'd just "do nothing" :woman_facepalming:.
  • DHCP lease obtained, but no DHCP renewal logic. Everything worked for a while but then wouldn’t :woman_facepalming:.
  • DHCP renewal in the wrong part of the program loop which meant that it was renewing the DHCP lease roughly every millisecond :woman_facepalming:. (Happy to report that the MikroTik router didn’t even flinch at such stupidity).
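The fix for that last one is the classic non-blocking "check the elapsed time on every pass, act only when due" pattern. Here's a language-agnostic sketch of the idea in Python (the ESP32 version does the same thing with millis(); renew_fn here is a hypothetical stand-in for whatever your Ethernet library's lease maintenance call is):

```python
import time

RENEW_INTERVAL = 3600.0  # seconds; renew well inside the lease time

def make_renewer(renew_fn, interval=RENEW_INTERVAL, clock=time.monotonic):
    """Return a tick() that is safe to call on every pass of the main
    loop but only invokes renew_fn once per interval."""
    last = clock()
    def tick():
        nonlocal last
        now = clock()
        if now - last >= interval:
            renew_fn()  # e.g. a hypothetical maintain_dhcp_lease()
            last = now
    return tick
```

Calling tick() a thousand times a second is then harmless, because the renewal only fires when the interval has actually elapsed.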

Hardware wise, I had intended to use the same NaviSys GM-8013R (the one dangling in the garage) and some MAX232 level shifters to bring the RS232 signal down to TTL for the ESP32 board. I decided against this as I figured the level shifting would introduce some sort of latency. I couldn't find a definitive answer and it's probably negligible, but if I'm already mucking about at the microseconds level (millionths of a second) then even 1-2μs is going to matter. So, I picked up an XC3710 from Jaycar, which is essentially the same thing (based on a u-blox NEO-7M compared to the GM-8013R's 8M) and a module I'd worked with a few years ago, until accidentally connecting +5V somewhere I shouldn't have connected +5V :woman_facepalming:.

The module worked great, was dead easy to hook up, and I could see the (essentially empty) NMEA data coming through as soon as I’d configured some of the GPIO pins in the code.


It sat near the window for hours but couldn’t get a decent fix, so we went into the yard for a few minutes until it had a complete almanac. Since then, it’s been solid inside. I always forget how excruciatingly long it takes when firing up one of these modules inside, and how quick it is when you just take the whole lot outside for a minute or two!


So, after probably hundreds of iterations of gradually adding pieces to my building blocks of code and stitching them together, this is what I ended up with.


Insert cable → receive NTP. Exactly what I wanted. My PoE switch reports that it's drawing 1.1W. I added some LEDs to give me some status indication on the GNSS lock and PPS 1Hz signals. Next steps will be to continue tidying up the code, remove all the Serial.println debugging code, probably remove the HTTP server entirely (as the LEDs give me the info I care about), and see if removing any of that unnecessary stuff from the program loop reduces the jitter at all. I also want to introduce some sort of rate limiting (e.g., a single IP can't make more than one NTP request every second), once I figure out how to do that.
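On the rate limiting idea, the simplest scheme that fits on a microcontroller is a small table of "last request time per client IP", dropping anything that arrives too soon. A sketch of the concept in Python (the ESP32 version would use millis() and a fixed-size array rather than a dict; MAX_CLIENTS and the crude eviction are illustrative choices, not a recommendation):

```python
import time

MIN_INTERVAL = 1.0  # seconds between NTP requests per client IP
MAX_CLIENTS = 64    # cap the table size on a small device

_last_seen = {}     # ip -> last allowed request time

def allow_request(ip, now=None):
    """Return True if this client may be answered, False to drop."""
    now = time.monotonic() if now is None else now
    last = _last_seen.get(ip)
    if last is not None and now - last < MIN_INTERVAL:
        return False
    if len(_last_seen) >= MAX_CLIENTS and ip not in _last_seen:
        _last_seen.clear()  # crude eviction: better than running out of RAM
    _last_seen[ip] = now
    return True
```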

At some stage next year, I’ll button it all up into a case of some sort, but for now it can keep pulsing away in the cardboard box on the corner of my work bench.


Finally, I wanted to loop back to thank the HLB community for your valuable discussions throughout the year. I came back to this project because:

  • Energy monitoring discussions highlighted to me that I’m running a power hungry Proxmox server to do a lot of “stuff” that could easily be done on lower power modules.
  • Meshtastic and ESPHome experimentation and discussions (and de-bricking :woman_facepalming:) nudged me a little towards the idea of working with a bare ESP32 dev board on a project from scratch.
  • A handful of AI LLMs we've been tinkering with helped with some of the debugging, both in a "rubber duck debugging" sense, and also to sharpen up the program loop towards the end of the project (e.g., move the PPS processing to one core of the ESP32-S3, and leave the rest on the second core of the microcontroller).
  • Source code was actually being sensibly managed rather than being chucked into a random folder (my usual method), and after a significant tidy up I'll probably look into some code review tools to try and untangle the spaghetti into something a bit more sensible.

These elements (and I’m sure other parts too) all directly stem from discussions we’ve had throughout the year either in person, on this very Discourse, or on Jitsi. Cheers to the group and all the input you’ve had on my projects and homelab, probably without realising it!

A few months have passed, so it’s time for another self hosted NTP update (I can hear the collective groan from HLB from here :joy:).

The ESP32 based NTP server was running okay, but needed a bit more debugging. I was relatively happy with the results, and I'll eventually use it as a jumping off point to design and get a PCB made before going any further. I want to continue with the PoE powered embedded NTP project when I get a chance to try out some new skills, such as PCB design.

Over the past few months, I’ve bought a couple of different modules (based on the u-blox 7M and 8M), and have been experimenting with them and the ones I’ve already discussed in this thread. It’s been a good exercise to learn a lot more about these modules, and how to program them to achieve specific goals. With that in mind, time to clean up the NaviSys (u-blox NEO-8M) implementation that’s been running since April and hovering around ±40ns accuracy, because everyone needs 2+ Stratum 1 NTP servers in their homelab…

I picked up a Dell Wyse 5070 Thin Client in August last year, then some extra RAM, as well as an M.2 based Intel i210 GbE NIC. It's a fanless "1 Litre" style box, and as configured it was about AU$140 total.

I’ve professed my love for repurposing these thin client units before, but this configuration has some particularly interesting features. The extra i210 supports hardware timestamping, and the DB9 RS232 port on the back of these units can be jumpered to output +5V on pin 1 or pin 9.

I buttoned up the NaviSys GM-8013R module which started this thread into a $3 project box from Jaycar (clear, so I can see the LED), connected with a $7 DB9 cable from Amazon. I've programmed the module to sync the PPS signal to UTC time, and to only output PPS plus the UBX TIM-TP and NAV-TIMEUTC messages at 115200 baud. All NMEA sentences have been disabled; UBX binary output seems to be the most efficient way to get data out of the unit. I've tried to pare down the amount of data coming over the serial bus to the absolute bare minimum in the hope that I might squeeze a few extra ns out of the setup.
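As an aside for anyone poking at UBX binary output for the first time: every UBX frame is the sync bytes 0xB5 0x62, then message class, message ID, a little-endian 16-bit payload length, the payload, and a two-byte 8-bit Fletcher checksum computed over everything between the sync bytes and the checksum. A minimal Python sketch of the framing:

```python
def ubx_checksum(data):
    """8-bit Fletcher checksum over class, id, length, and payload bytes."""
    ck_a = ck_b = 0
    for b in data:
        ck_a = (ck_a + b) & 0xFF
        ck_b = (ck_b + ck_a) & 0xFF
    return ck_a, ck_b

def ubx_frame(msg_class, msg_id, payload=b""):
    """Build a complete UBX frame: sync, class, id, length, payload, checksum."""
    body = bytes([msg_class, msg_id]) + len(payload).to_bytes(2, "little") + payload
    ck_a, ck_b = ubx_checksum(body)
    return b"\xb5\x62" + body + bytes([ck_a, ck_b])

# e.g. polling UBX-NAV-TIMEUTC (class 0x01, id 0x21) is an empty-payload frame
poll_timeutc = ubx_frame(0x01, 0x21)
```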

Together, the whole unit looks like this:

Power comes from pin 9 on the Wyse 5070. No extra cables to add, other than the power brick for the thin client, and two Ethernet cables.

Using Debian 13 (Trixie, stable) and a similar setup documented earlier in this thread, I’ve set up gpsd to decode the UBX data, and chrony to use the gpsd [UBX] data and PPS pulse on the RS232 port’s pin 1 to serve NTP.

Given that the i210 does hardware timestamping, this time around I’ve also set up the machine as a PTP Grandmaster (IEEE 1588‑2008 - version 2) using phc2sys and ptp4l (linuxptp on Debian). phc2sys synchronises the hardware clock in the i210 NIC to chrony’s time source, and then ptp4l does the actual PTP heavy lifting. It’s all “working”, but I don’t think I have anything else at home that speaks PTP, so I can’t test it :joy:. I’ll do a full hardware audit at some stage and see if I’ve got something here that can be a PTP client, but otherwise it might be a good excuse to buy some new toys :grin:. Either way, it’s set up so when I get some PTP capable devices I can get the time accuracy from milliseconds down to nanoseconds over the LAN.
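For reference, the division of labour between the two linuxptp daemons boils down to two long-running commands along these lines (a sketch, not my exact service files; eth1 is a placeholder for whatever the i210 enumerates as):

```
# phc2sys: discipline the i210's hardware clock (PHC) from the
# chrony-steered system clock; -w waits for ptp4l's UTC offset info
phc2sys -s CLOCK_REALTIME -c eth1 -w

# ptp4l: serve PTP on the wire using the NIC's hardware timestamping
ptp4l -i eth1
```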

This wasn’t intended as a full write up or tutorial, but just a bit of a project update (and a way to force myself to somewhat document what I’ve been doing). Happy to go into more detail or turn it into a talk if there’s some interest from the group. The whole unit draws 6-11W, and is very neat and tidy compared to some of my other NTP builds. With the addition of PTP, this is likely to be the production server at home for a while.

Next steps are to buy a serial extension cable and put the clear plastic box somewhere permanent and away from the RFI in my equipment rack at home. I also want to experiment with programming in a fixed position and whether that improves PPS stability, bearing in mind that I’m using a “navigation grade” module and not a “timing grade” one. I’ll leave this to settle for at least a few weeks first.

Excellent question, @jdownie from over a year ago. Let’s find out :joy:. It took me a year to come back to “I’ll try it in an LXC container”, but I got there eventually.

Wasn’t sure whether to break the Linux kernel stuff out into another thread, but I’ll leave it here as it is GPS/PPS/NTP related and I know search engines (and no doubt AI) scrape this site and it may be useful to someone else out there in internet land. It also forces me to try and retrace my steps over the past week, and gives me some documentation in case I need to remember what I did at some stage too :joy:.

Over the past week I've experimented with several configurations on the Wyse 5070 + GNSS module, with the ultimate goal of virtualising or containerising the NTP/PTP side of it. I know it shouldn't ever be moved off the bare metal, but it was an interesting challenge to work through, and I was motivated by my desire to move Home Assistant and DNS to this machine and set it up as a bit of a low power "mission critical services" box, while keeping it as "standard Proxmox VE 9" as possible.

I've now got it running very nicely under LXC on PVE 9, and will gradually move a few other VMs and containers across over the coming month while keeping an eye on the NTP stability. Yes, I could have run NTP on the Proxmox host itself, but I wanted to keep the NTP stuff separate if I could, so that it was nicely bundled up in something I could effortlessly back up, move between hardware if need be, and that would be more resilient to future Proxmox upgrades breaking something I've hand configured. The headline takeaway is don't virtualise/containerise this stuff, but if you still want to do it anyway, it is possible to get something pretty solid going.

On the project side, I moved from UBX to NMEA as I was getting occasional failed packet decodes and wanted to eliminate some variables. I also double checked that the cable shielding was grounded, and added ferrite beads to the cable between the NaviSys unit and the 5070 to try to help with the occasional corrupted packet, but I don’t think those actions had any measurable impact overall.
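As background for anyone wondering what a “failed packet decode” looks like in practice: every NMEA sentence ends in a checksum, which is the XOR of all characters between the “$” and the “*”, so a single corrupted byte makes the whole sentence fail validation. A minimal bash sketch of that check (nmea_ok is my own hypothetical helper, not part of gpsd):

```shell
#!/bin/bash
# Validate an NMEA 0183 sentence checksum: XOR every character between
# '$' and '*', then compare against the two hex digits after the '*'.
nmea_ok() {
  local s=${1#\$}        # drop the leading '$'
  local data=${s%\**}    # payload: everything before the '*'
  local want=${s##*\*}   # claimed checksum: the hex digits after the '*'
  local cs=0 i c
  for ((i = 0; i < ${#data}; i++)); do
    printf -v c '%d' "'${data:i:1}"   # character -> ASCII code
    cs=$((cs ^ c))
  done
  [ "$(printf '%02X' "$cs")" = "${want^^}" ]
}

# A classic example sentence from the NMEA docs passes the check:
nmea_ok '$GPGLL,4916.45,N,12311.12,W,225444,A,*1D' && echo "checksum OK"
```

Flip any byte in the payload and the same call fails, which is exactly what a noisy serial line does to the decode.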

I am not at all an expert in any of the kernel stuff, and was just muddling my way through incrementally trying options. If you spot something wrong or see an avenue for improvement or for me to learn, please let me know!


Having said all that, here is some tuning info that may be of interest to anyone building an NTP server, or who wants to containerise latency sensitive projects more generally:

On the Proxmox host, I set up a systemd unit to attach PPS from the NaviSys module:

[Unit]
Description=PPS from /dev/ttyS0
After=dev-ttyS0.device
Requires=dev-ttyS0.device

[Service]
Type=forking
ExecStart=/usr/sbin/ldattach pps /dev/ttyS0
Restart=always

[Install]
WantedBy=multi-user.target

This creates /dev/pps1 (pps0 being attached to the PTP capable network card) and attaches the kernel PPS line discipline. In a VM, I had some weirdness where asserts would arrive fine but clears would arrive with stale timestamps; no idea what that was about, but it seems fine in a container. I don’t think chrony uses the clears anyway, but it was still… odd.

Also on the Proxmox host, I changed the NTP server from the NTP pool to my (soon to be local) NTP server to avoid the host changing the system time out from under the NTP server in the container. It took me far longer than I care to admit to figure out where an occasional sudden time change would come from :woman_facepalming:.

I then created a new privileged LXC container (ID 123) and edited its conf file per the below:

arch: amd64
cores: 1
features: nesting=1
hostname: xxxxxxx
memory: 2048
ostype: debian
rootfs: local:123/vm-123-disk-0.raw,size=8G
swap: 512
lxc.net.0.type: phys
lxc.net.0.link: enp2s0
lxc.net.0.flags: up
lxc.cgroup2.cpuset.cpus: 3
lxc.cgroup2.devices.allow: c 246:* rwm
lxc.mount.entry: /dev/pps1 dev/pps1 none bind,optional,create=file
lxc.cgroup2.devices.allow: c 4:64 rwm
lxc.mount.entry: /dev/ttyS0 dev/ttyS0 none bind,optional,create=file
lxc.cap.drop: 
lxc.apparmor.profile: unconfined

Waaaaaaay more RAM and swap than I needed (it hasn’t exceeded 50MB since stabilising) but I’ll leave it be for now. I also experimented with an unprivileged container and had some odd issues, so I went back to privileged. Most of it is self explanatory, but some key things from the above:

  • nesting=1 (Debian won’t spawn a login prompt without it, and no doubt other stuff is broken too. It warns you about this, so it’s probably not something you’d choose to do, but I experimented without it so I’ll highlight that yes, you do need it).
  • lxc.apparmor.profile: unconfined Honestly, no idea if this is necessary. I never ran into any AppArmor issues in early experiments, and this ended up being a way to reduce the number of variables more than anything.
  • lxc.cgroup2.cpuset.cpus: 3 Pin the LXC container to Core 3 of the Celeron J4105 in the Wyse 5070 (cores are zero indexed, so core 3 is the fourth and last one; more on this later).
  • lxc.net... Pass through the Intel i210.
  • lxc.cgroup2.devices.allow: c 246:* rwm + lxc.mount.entry: /dev/pps1 dev/pps1 none bind,optional,create=file Pass through /dev/pps1. Essentially, allow the LXC container to access all devices of type 246 (PPS), and then mount pps1 into the container, make it optional for container start up (in case something breaks), and create a file in /dev/ if it’s not there.
  • lxc.cgroup2.devices.allow: c 4:64 rwm + lxc.mount.entry: /dev/ttyS0 dev/ttyS0 none bind,optional,create=file As above, pass through /dev/ttyS0 (for UBX or NMEA data). Major device number 4 is tty, and minor 64 is ttyS0 specifically.
  • lxc.cap.drop: Override the default list of capabilities that Proxmox drops from the container (by blanking out that line). In this case, I want to allow the container to do things containers shouldn’t normally be allowed to do (e.g., set the system time, i.e., CAP_SYS_TIME).
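One gotcha with the devices.allow lines: the PPS major number (246 here) is dynamically allocated by the kernel, so it’s worth confirming on your own system rather than copying mine. A hedged little helper (allow_line is my own invention) that reads the real major:minor off the device node using GNU stat:

```shell
#!/bin/bash
# Print an lxc.cgroup2.devices.allow line for a character device by reading
# its actual major:minor numbers. GNU stat's %t and %T print them in hex.
allow_line() {
  printf 'c %d:%d rwm\n' "0x$(stat -c %t "$1")" "0x$(stat -c %T "$1")"
}

# e.g. on the Proxmox host:
#   allow_line /dev/pps1    # on my box: c 246:1 rwm (pps minors count up from 0)
allow_line /dev/null        # /dev/null is always char 1:3 on Linux -> c 1:3 rwm
```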

Container created, gpsd + chrony installed, as per the million and one tutorials already on the internet. ppstest shows healthy asserts and clears, and cgps shows the data from the GNSS module coming through as per the above post.
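For the paranoid, a hedged sketch for eyeballing those pulses numerically: capture some ppstest output and measure the spacing between consecutive asserts, which should sit very close to 1.000000 s. assert_gaps is my own helper, and it assumes ppstest’s usual “source 0 - assert <timestamp>, sequence: …” line format:

```shell
#!/bin/bash
# Print the interval between consecutive PPS assert timestamps from
# captured `ppstest /dev/pps1` output; a healthy pulse train is ~1 s apart.
assert_gaps() {
  awk '/assert/ { t = $5 + 0; if (prev) printf "%.6f\n", t - prev; prev = t }'
}

# Usage on the host (Ctrl-C ppstest after ~10 seconds):
#   ppstest /dev/pps1 | tee pps.log
#   assert_gaps < pps.log
```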

/etc/default/gpsd and /etc/chrony/chrony.conf were fairly standard, but I’ll add the following notes.

My refclock lines in chrony.conf were:

refclock SOCK /var/run/chrony.ttyS0.sock delay 0.05 refid GNSS poll 3
refclock PPS /dev/pps1 lock GNSS refid PPS poll 0 prefer

For anyone following along at home, essentially the PPS signal is an accurate trigger of when a second boundary happens, but the data as to which second that is/was comes in via GPS/GNSS (or another NTP server or time source, for that matter). PPS needs to be as accurate as possible, the other time source can be slightly out.
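A toy illustration of that division of labour, assuming the coarse time source is good to well within ±0.5 s: the PPS edge pins the boundary precisely, and rounding the coarse timestamp seen at the edge to the nearest whole second tells you which second the edge belongs to (label_second is a made-up name for the sake of the example):

```shell
#!/bin/bash
# Toy example: given the slightly late or early coarse timestamp observed
# at a PPS edge, label the edge with the nearest whole second.
label_second() {
  awk -v t="$1" 'BEGIN { printf "%d\n", t + 0.5 }'
}

label_second 1700000000.050   # coarse time 50 ms late  -> prints 1700000000
label_second 1699999999.962   # coarse time 38 ms early -> prints 1700000000
```

Either way, the ambiguity only matters if the coarse source drifts toward half a second out, which is why a sloppy time-of-day feed is still fine alongside a tight PPS.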

I tried using SHM initially, but found SOCK to have much less delay and far less jitter. It seems to be the “more modern” way of communicating between gpsd and chrony anyway. delay 0.05 was added about 36 hours ago, as the ~50 ms delay between the GNSS data and PPS was pretty consistent on my system.

In gpsd:

DEVICES="/dev/ttyS0"

# Other options you want to pass to gpsd
GPSD_OPTIONS="-n -s 115200 -b"

# Automatically hot add/remove USB GPS devices via gpsdctl
USBAUTO="false"

GPSD_GROUP="dialout"

Not much here beyond the default or self-explanatory, but: “-n” = no wait, run all the time (standard in any NTP tutorial); “-s 115200” sets the baud rate (previously programmed into the module). I did several experiments with and without -b, and ended up adding it again (after reprogramming the module back to my own config) to eliminate more variables. -b is “broken device mode”, which essentially puts gpsd into read only mode rather than allowing it to send commands to the module to change configs. I have no idea if I was running into odd firmware quirks with my module, but in my case life was better with the module left alone. Likewise, the dialout group line was added at one stage due to some odd permissions issue I was having with the serial port. Not sure if it needs to be there, but it didn’t hurt, so I have left it.

The NIC and the various routers/firewalls at home were configured as one normally would. I also added chrony-stats (GitHub: TheHuman00/chrony-stats) and micro-httpd (from the Debian trixie repos) to the container to get some remote stats, so I could obsess over every config tweak while it settled :joy:.


Back to the host:
/etc/default/grub edited to include the following:

GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt isolcpus=3 nohz_full=3 rcu_nocbs=3"
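Note for anyone copying this: the edit only takes effect after regenerating the GRUB config and rebooting, and it’s worth confirming the flags actually reached the running kernel. A sketch (isolated_cpus is my own hypothetical helper):

```shell
#!/bin/bash
# After editing /etc/default/grub on the host:
#   update-grub && reboot
# Then confirm the isolation flags made it into the running kernel:
#   isolated_cpus "$(cat /proc/cmdline)"    # should print: 3

# Pull the isolcpus= value out of a kernel command line string.
isolated_cpus() {
  printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^isolcpus=//p'
}
```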

I added iommu=pt during my VM experiments, but don’t think it’s technically needed for LXC containers. Didn’t hurt to leave it there, so I left it.

isolcpus=3 nohz_full=3 rcu_nocbs=3 is arguably the more interesting part: remove Core 3 from the SMP scheduling “pool” so the machine won’t put ordinary tasks on it, remove scheduling ticks from Core 3, and move Read-Copy-Update background callbacks off Core 3. In short, we want as much as possible removed from Core 3, so that the LXC container pinned to it has the core to itself.

Lastly, on the host, cat /proc/interrupts shows what interrupts are happening on each CPU core. I won’t reproduce the whole lot here, but we want to pin the interrupts for ttyS0 and enp2s0 to Core 3, and move as many other interrupts off Core 3 as possible.

In my case, that means pinning IRQ 4 (ttyS0) and IRQs 131-135 (enp2s0) to Core 3, and moving the rest of cat /proc/interrupts off Core 3 (I chose Core 0).

Won’t reproduce the whole lot here, but as examples:

  • echo 8 > /proc/irq/4/smp_affinity to move IRQ 4 to Core 3 (binary 1000 = decimal 8 = “Use this core? YNNN”)
  • echo 1 > /proc/irq/136/smp_affinity to move IRQ 136 to Core 0 (binary 0001 = decimal 1 = “Use this core? NNNY” - 136 in my case being enp1s0, i.e., unrelated hardware to the NTP container)

I’ll probably slap all that into a shell script to run at boot at some stage, but for now I’m just being really lazy and redoing it from .bash_history each time on the very rare occasions the Proxmox host needs to reboot. It also allows me to double check that the IRQs haven’t changed as I’ve been mucking around.
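For anyone wanting to skip straight to the script stage, a hedged sketch of what mine will probably look like. The IRQ numbers are specific to my box (4 = ttyS0, 131-135 = enp2s0) and must be re-checked against /proc/interrupts after any hardware or kernel change:

```shell
#!/bin/bash
# Pin NTP-related IRQs to Core 3 and push everything else to Core 0.
# IRQ numbers below are from MY system; verify yours in /proc/interrupts.

NTP_IRQS="4 131 132 133 134 135"

# Hex affinity mask with only bit N set, e.g. core 3 -> 8 ("YNNN").
core_mask() { printf '%x' $((1 << $1)); }

pin_irq() {
  local irq=$1 core=$2
  # Some IRQs (e.g. the timer) can't be moved; silently ignore those errors.
  [ -d "/proc/irq/$irq" ] &&
    echo "$(core_mask "$core")" 2>/dev/null > "/proc/irq/$irq/smp_affinity"
}

if [ "${1:-}" = "--apply" ]; then
  for irq in $NTP_IRQS; do pin_irq "$irq" 3; done
  for d in /proc/irq/[0-9]*; do
    irq=${d##*/}
    case " $NTP_IRQS " in *" $irq "*) continue ;; esac
    pin_irq "$irq" 0
  done
fi
```

Run as root with --apply (e.g. from a simple systemd oneshot unit at boot); without the flag it just defines the helpers.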

cat /proc/interrupts after the changes should show the interrupt counts incrementing on the appropriate core only from this point.
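To make that check slightly less squint-inducing, a small helper (again my own sketch, assuming the four-core layout of the J4105 and the standard /proc/interrupts column format) that pulls out the per-CPU counts for a single IRQ:

```shell
#!/bin/bash
# Print the per-CPU interrupt counts for one IRQ from /proc/interrupts
# (columns 2-5 are CPU0-CPU3 on this four-core machine).
irq_counts() {
  awk -v irq="$1:" '$1 == irq {
    for (i = 2; i <= 5; i++) printf "%sCPU%d=%s", (i > 2 ? " " : ""), i - 2, $i
    print ""
  }'
}

# e.g.:  irq_counts 4 < /proc/interrupts
# Run it twice a few seconds apart; only the pinned core's count should move.
```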

I also checked irqbalance wasn’t running (or even installed; it doesn’t seem to be in a fresh PVE 9 install). I didn’t add irqaffinity or kthread_cpus to the GRUB boot parameters, as I figured I didn’t want to accidentally stop the kernel (and therefore the PPS and LXC) from sitting on that core at all, nor remove default IRQs from Core 3 entirely. I thought it would be better to tinker after boot using smp_affinity, but I’m well down into the weeds and outside my own area of knowledge at this point. Happy to learn from anyone out there more familiar with this than I am!

I also didn’t move to an RT kernel on the host, as I know Proxmox really wants to use its own version of the kernel and I wanted to keep Proxmox as stock standard as possible.


So, how is it performing?

In short, it’s accurate to within ~3–10 µs (~0.000003 to 0.000010 seconds) of UTC, and very stable. Not quite as tight as my bare metal experiments, but pretty close. Certainly close enough for a homelab Stratum 1 NTP server, and something that plays nicely with Proxmox.

Definitely given me a good source for home, and something to benchmark against when I come back to the ESP32 version of the project. For now, I will go outside and do some non-NTP related things :grin:.

chronyc sourcestats -v
                             .- Number of sample points in measurement set.
                            /    .- Number of residual runs with same sign.
                           |    /    .- Length of measurement set (time).
                           |   |    /      .- Est. clock freq error (ppm).
                           |   |   |      /           .- Est. error in freq.
                           |   |   |     |           /         .- Est. offset.
                           |   |   |     |          |          |   On the -.
                           |   |   |     |          |          |   samples. \
                           |   |   |     |          |          |             |
Name/IP Address            NP  NR  Span  Frequency  Freq Skew  Offset  Std Dev
==============================================================================
GNSS                       29  15   224     +0.493      0.171    +12us    16us
PPS                        64  28    63     -0.007      1.375     -2ns    58us
services-bne.aussiebroad>   7   4  1167     +4.084      1.922   -837us   374us
time.cloudflare.com         6   4   453    +12.950     39.818  -2860us  1347us
tic.ntp.telstra.net        11   6  172m     +4.164      0.581  -1960us  1493us
chronyc sources -v

  .-- Source mode  '^' = server, '=' = peer, '#' = local clock.
 / .- Source state '*' = current best, '+' = combined, '-' = not combined,
| /             'x' = may be in error, '~' = too variable, '?' = unusable.
||                                                 .- xxxx [ yyyy ] +/- zzzz
||      Reachability register (octal) -.           |  xxxx = adjusted offset,
||      Log2(Polling interval) --.      |          |  yyyy = measured offset,
||                                \     |          |  zzzz = estimated error.
||                                 |    |           \
MS Name/IP address         Stratum Poll Reach LastRx Last sample               
===============================================================================
#- GNSS                          0   3   377    10  +5227ns[+3784ns] +/-   25ms
#* PPS                           0   0   377     0    -31us[  -32us] +/-   30ns
^- services-bne.aussiebroad>     2   6   377    26  -3186us[-3185us] +/-   30ms
^- time.cloudflare.com           3   6   377     1  -6558us[-6559us] +/-   16ms
^- tic.ntp.telstra.net           2  10   377   221  -2703us[-2638us] +/-   37ms
chronyc selectdata -v
  . State: N - noselect, s - unsynchronised, M - missing samples,
 /         d/D - large distance, ~ - jittery, w/W - waits for others,
|          S - stale, O - orphan, T - not trusted, P - not preferred,
|          U - waits for update, x - falseticker, + - combined, * - best.
|   Effective options   ---------.  (N - noselect, P - prefer
|   Configured options  ----.     \  T - trust, R - require)
|   Auth. enabled (Y/N) -.   \     \     Offset interval --.
|                        |    |     |                       |
S Name/IP Address        Auth COpts EOpts Last Score     Interval  Leap
=======================================================================
P GNSS                      N ----- -----   10   1.0   -25ms   +25ms  N
* PPS                       N -P--- -P---    0   1.0   -32us   -32us  N
P services-bne.aussiebroad> N ----- -----   25   1.0   -32ms   +26ms  N
P time.cloudflare.com       N ----- -----    0   1.0   -22ms +9025us  N
P tic.ntp.telstra.net       N ----- -----  220   1.0   -40ms   +30ms  N
chronyc tracking
Reference ID    : 50505300 (PPS)
Stratum         : 1
Ref time (UTC)  : Wed Feb 25 00:30:00 2026
System time     : 0.000003012 seconds slow of NTP time
Last offset     : -0.000000811 seconds
RMS offset      : 0.000004714 seconds
Frequency       : 4.430 ppm fast
Residual freq   : -0.007 ppm
Skew            : 1.408 ppm
Root delay      : 0.000000001 seconds
Root dispersion : 0.000017366 seconds
Update interval : 1.0 seconds
Leap status     : Normal

References:

Ahh, @Belfry: “NTP, the final frontier. These are the voyages of the starship HLB. Its five-year mission: to explore strange new worlds, to seek out new life and new civilizations, to boldly go where no man has gone before.”

I think this is a really important post but only understood about 10% of it. However, as we have discussed before, and you mention again above, this post may be crucial to that small group of people on the internet looking at this problem and hoping for a solution.

In that vein it is similar to my post on running a Proxmox Backup Server in a virtual machine on QNAP hardware. With the right keywords, Google will find it, and all the AIs will absorb its content, given that we do not restrict scraping.

However, it is unlikely to enthrall all visitors to HLB. I have therefore created a new category, Howtos, for these personal projects and IT journeys. I encourage others to post in that category for the benefit of future man and AI kind.