Discussion:
[Linuxptp-devel] ntp SHMs
Gary E. Miller
2015-02-20 22:45:46 UTC
Yo All!

Gary E. Miller here, with the gpsd project.

I can't seem to access linuxptp-users or linuxptp-devel archives
on sourceforge. Am I missing something?

There have been recent conversations on gpsd-dev with Harlen about
changing the SHM interface. I see linuxptp uses that, anyone got
any opinions on changes?



BTW, I'm still working on getting LinuxPTP up on my local net. Are
there any howto docs? I don't want to use timemaster as I am very picky
about my chrony and ntpd configs.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97701
***@rellim.com Tel:+1(541)382-8588
Richard Cochran
2015-02-21 15:31:43 UTC
Post by Gary E. Miller
Yo All!
Gary E. Miller here, with the gpsd project.
I can't seem to access linuxptp-users or linuxptp-devel archives
on sourceforge. Am I missing something?
The web interface on SF is barely usable. Try gmane instead.

http://news.gmane.org/gmane.comp.linux.ptp.user
http://news.gmane.org/gmane.comp.linux.ptp.devel
Post by Gary E. Miller
There have been recent conversations on gpsd-dev with Harlen about
changing the SHM interface. I see linuxptp uses that, anyone got
any opinions on changes?
If anyone has an opinion, that would be Miroslav Lichvar, the author
of that code.
Post by Gary E. Miller
BTW, I'm still working on getting LinuxPTP up on my local net. Are
there any howto docs? I don't want to use timemaster as I am very picky
about my chrony and ntpd configs.
The best howto is the RedHat guide. Much of it applies to
non-RedHat systems as well.

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Deployment_Guide/ch-Configuring_PTP_Using_ptp4l.html

Thanks,
Richard
Richard Cochran
2015-02-21 15:59:41 UTC
Post by Gary E. Miller
BTW, I'm still working on getting LinuxPTP up on my local net. Are
there any howto docs? I don't want to use timemaster as I am very picky
about my chrony and ntpd configs.
And check the list of supported NICs in the README. Unlike NTP,
linuxptp can only work with supported hardware, even for software time
stamping. We do have a number of popular cards, but it also depends
on the kernel version: the newer, the better.

Thanks,
Richard
Gary E. Miller
2015-02-22 01:51:16 UTC
Yo Richard!

On Sat, 21 Feb 2015 16:59:41 +0100
Post by Richard Cochran
Post by Gary E. Miller
BTW, I'm still working on getting LinuxPTP up on my local net. Are
there any howto docs? I don't want to use timemaster as I am very
picky about my chrony and ntpd configs.
And check the list of supported NICs in the README. Unlike NTP,
linuxptp can only work with supported hardware, even for software time
stamping. We do have a number of popular cards, but it also depends
on the kernel version: the newer, the better.
I guess I got lucky and software timestamping is working for me on
an Intel 80003ES2LAN. It uses the e1000e driver, marked as hardware
supported, but does not have hardware support.

I'm getting 5 uSec jitter between one hardware timing host and one software
time host.

Sadly it would not work on a WiFi port, even though 'ethtool -T' shows
the port supports software timestamping. Is there a better way to tell
what is supported?

I am using kernel 3.19.0.

The supported NICs are a bit pricey, $40 and up, when a nice GigE
card can be $15. I ordered two so I can test a three way hardware
setup.

I have started to add a PTP section to the gpsd time service howto, but
it will need more.

RGDS
GARY
Richard Cochran
2015-02-22 14:05:48 UTC
Post by Gary E. Miller
I guess I got lucky and software timestamping is working for me on
an Intel 80003ES2LAN. It uses the e1000e driver, marked as hardware
supported, but does not have hardware support.
Right, the e1000e driver supports a whole family of cards, only some
of which have hardware time stamping.
Post by Gary E. Miller
I'm getting 5 uSec jitter between one hardware timing host and one software
time host.
Sounds about right, actually pretty good. Adding a CPU and/or network
load should make it worse.
Post by Gary E. Miller
Sadly it would not work on a WiFi port, even though 'ethtool -T' shows
the port supports software timestamping. Is there a better way to tell
what is supported?
Unfortunately not. Things are getting better as time goes on.

Regarding wifi, does it say "software-transmit" as well as
"software-receive", or software-receive only?

PTP doesn't work well anyhow over wifi because of the unpredictable
transmission scheme. The newer wireless protocol actually includes a
ptp-like synchronization scheme, but I am not aware of any hardware or
software implementations.
Post by Gary E. Miller
I am using kernel 3.19.0.
Good.
Post by Gary E. Miller
The supported NICs are a bit pricey, $40 and up, when a nice GigE
card can be $15. I ordered two so I can test a three way hardware
setup.
The trend is to include PTP support in newer designs. I guess that it
will eventually become a standard feature in most MACs.


Cheers,
Richard
Gary E. Miller
2015-02-22 21:11:12 UTC
Yo Richard!

On Sun, 22 Feb 2015 15:05:48 +0100
Post by Richard Cochran
Post by Gary E. Miller
I'm getting 5 uSec jitter between one hardware timing host and one
software time host.
Sounds about right, actually pretty good. Adding a CPU and/or network
load should make it worse.
Yup, I clearly see that, nothing worse than 7 uSec yet.
Post by Richard Cochran
Regarding wifi, does it say "software-transmit" as well as
"software-receive", or software-receive only?
No. I noticed that the Redhat HOWTO says all three are needed. That
should be in the linuxptp README.
Post by Richard Cochran
PTP doesn't work well anyhow over wifi because of the unpredictable
transmission scheme.
Can it be worse than just plain old NTP?
Post by Richard Cochran
The newer wireless protocol actually includes a
ptp-like synchronization scheme, but I am not aware of any hardware or
software implementations.
Interesting...

Sadly no luck getting phc2sys to work for me on a system already
running ptp4l. I changed time_stamping from software to hardware and ran this:

kong ~ # phc2sys -a -m -l 7
phc2sys[880.851]: PI servo: sync interval 1.000 kp 0.700 ki 0.300000
phc2sys[881.852]: reconfiguring after port state change
phc2sys[881.852]: master clock not ready, waiting...

Not sure what to do now. I wish there was a -l 8 to output the
automatic settings.

I looked at doing manual settings and got confused. Like what to set
for -M? Same as ptp4l or unique? Got my system clock off by 24 hours
and gave up...

BTW, here is what my ethtool says:

kong ~ # ethtool -T eth0
Time stamping parameters for eth0:
Capabilities:
hardware-transmit (SOF_TIMESTAMPING_TX_HARDWARE)
software-transmit (SOF_TIMESTAMPING_TX_SOFTWARE)
hardware-receive (SOF_TIMESTAMPING_RX_HARDWARE)
software-receive (SOF_TIMESTAMPING_RX_SOFTWARE)
software-system-clock (SOF_TIMESTAMPING_SOFTWARE)
hardware-raw-clock (SOF_TIMESTAMPING_RAW_HARDWARE)
PTP Hardware Clock: 0
Hardware Transmit Timestamp Modes:
off (HWTSTAMP_TX_OFF)
on (HWTSTAMP_TX_ON)
Hardware Receive Filter Modes:
none (HWTSTAMP_FILTER_NONE)
all (HWTSTAMP_FILTER_ALL)
ptpv1-l4-sync (HWTSTAMP_FILTER_PTP_V1_L4_SYNC)
ptpv1-l4-delay-req (HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ)
ptpv2-l4-sync (HWTSTAMP_FILTER_PTP_V2_L4_SYNC)
ptpv2-l4-delay-req (HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ)
ptpv2-l2-sync (HWTSTAMP_FILTER_PTP_V2_L2_SYNC)
ptpv2-l2-delay-req (HWTSTAMP_FILTER_PTP_V2_L2_DELAY_REQ)
ptpv2-event (HWTSTAMP_FILTER_PTP_V2_EVENT)
ptpv2-sync (HWTSTAMP_FILTER_PTP_V2_SYNC)
ptpv2-delay-req (HWTSTAMP_FILTER_PTP_V2_DELAY_REQ)
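
A minimal sketch of the capability check discussed above, assuming the
output format of 'ethtool -T' matches the listing from kong. The helper
names and both sample texts are illustrative and not part of ethtool or
linuxptp; the receive-only sample is hypothetical, since no WiFi listing
was posted. The check only looks for the three software capabilities that
the Red Hat guide asks for.

```python
# Hypothetical helpers (not part of ethtool or linuxptp): scan captured
# `ethtool -T` text for the three software time stamping capabilities.
REQUIRED_SOFTWARE_CAPS = {
    "SOF_TIMESTAMPING_TX_SOFTWARE",   # software-transmit
    "SOF_TIMESTAMPING_RX_SOFTWARE",   # software-receive
    "SOF_TIMESTAMPING_SOFTWARE",      # software-system-clock
}


def parse_capabilities(ethtool_text):
    """Return the set of SOF_TIMESTAMPING_* names found in the text."""
    caps = set()
    for line in ethtool_text.splitlines():
        start = line.find("(SOF_TIMESTAMPING_")
        if start == -1:
            continue
        end = line.find(")", start)
        if end != -1:
            caps.add(line[start + 1:end])
    return caps


def supports_software_timestamping(ethtool_text):
    """True only if all three software capabilities are listed."""
    return REQUIRED_SOFTWARE_CAPS <= parse_capabilities(ethtool_text)


# Capabilities copied from the eth0 listing above.
ETH0_SAMPLE = """\
Capabilities:
hardware-transmit (SOF_TIMESTAMPING_TX_HARDWARE)
software-transmit (SOF_TIMESTAMPING_TX_SOFTWARE)
hardware-receive (SOF_TIMESTAMPING_RX_HARDWARE)
software-receive (SOF_TIMESTAMPING_RX_SOFTWARE)
software-system-clock (SOF_TIMESTAMPING_SOFTWARE)
hardware-raw-clock (SOF_TIMESTAMPING_RAW_HARDWARE)
PTP Hardware Clock: 0
"""

# Hypothetical receive-only port, for contrast (not taken from the thread).
RX_ONLY_SAMPLE = """\
Capabilities:
software-receive (SOF_TIMESTAMPING_RX_SOFTWARE)
software-system-clock (SOF_TIMESTAMPING_SOFTWARE)
PTP Hardware Clock: none
"""
```

Feeding it the live output would look like
supports_software_timestamping(subprocess.check_output(["ethtool", "-T", "eth0"], text=True)),
which needs ethtool installed and is left out of the sketch.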




RGDS
GARY
Richard Cochran
2015-02-23 08:50:43 UTC
Post by Gary E. Miller
No. I noticed that the Redhat HOWTO says all three are needed. That
should be in the linuxptp README.
Good point.
Post by Gary E. Miller
Post by Richard Cochran
PTP doesn't work well anyhow over wifi because of the unpredictable
transmission scheme.
Can it be worse than just plain old NTP?
I would think it can, depending on the PI weights.
Post by Gary E. Miller
Sadly no luck getting phc2sys to work for me on a system already
kong ~ # phc2sys -a -m -l 7
This would be without ntpshm.
Post by Gary E. Miller
phc2sys[880.851]: PI servo: sync interval 1.000 kp 0.700 ki 0.300000
phc2sys[881.852]: reconfiguring after port state change
phc2sys[881.852]: master clock not ready, waiting...
For a simple run without ntpshm, try this:

phc2sys -a -r -i eth0 -m -q

-a automatic mode
-r include system clock as a possible slave
-i interface that ptp4l is running on
-m messages to stdout/stderr
-q no messages to syslog
Post by Gary E. Miller
I looked at doing manual settings and got confused. Like what to set
for -M? Same as ptp4l or unique?
So, backing up a bit, when using hardware time stamping, the ptp4l
servo should never be set to ntpshm. That is only for software time
stamping.

For a HW time stamping setup, I would run ptp4l with defaults, and
then phc2sys like so.

phc2sys -s eth0 -E ntpshm -m -q

The -M value only has to agree with your ntp setup.

Miroslav?

I admit the options for phc2sys are really confusing. On the one
hand, the phc2sys underwent an "organic" development process, and on
the other hand, there really are a great many ways to configure the
system clock and one (or more!) PTP hardware clocks. I always have to
re-read the man page every time.
Post by Gary E. Miller
Got my system clock off by 24 hours
and gave up...
Oops.

Sorry,
Richard
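
The remark that -M only has to agree with the NTP setup can be made
concrete. A minimal sketch, assuming the conventional NTP SHM numbering
in which segment N is attached with System V key 0x4e545030 + N, ntpd
reaches it through its SHM refclock driver (type 28, unit N) and chrony
through "refclock SHM N". The function names are illustrative, and the
returned strings are single config fragments, not complete configurations.

```python
# Illustrative helpers: show what "agreeing" on an NTP SHM segment means.
NTPD_SHM_BASE_KEY = 0x4E545030  # the ASCII bytes "NTP0"


def shm_key(segment):
    """System V IPC key for NTP SHM segment number `segment`."""
    if segment < 0:
        raise ValueError("segment must be non-negative")
    return NTPD_SHM_BASE_KEY + segment


def ntpd_server_line(segment):
    """ntp.conf fragment selecting the SHM driver (type 28), unit `segment`."""
    return f"server 127.127.28.{segment}"


def chrony_refclock_line(segment):
    """chrony.conf fragment selecting the same segment."""
    return f"refclock SHM {segment}"


def describe(segment):
    """One-line summary tying the three views of a segment together."""
    return (f"segment {segment}: key {shm_key(segment):#x}, "
            f"ntpd '{ntpd_server_line(segment)}', "
            f"chrony '{chrony_refclock_line(segment)}'")
```

With "phc2sys ... -E ntpshm -M 2", describe(2) names the key and the
ntpd or chrony line that would have to match. Two writers posting to the
same segment number end up on the same key, so each source needs its own.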
Gary E. Miller
2015-02-23 19:50:27 UTC
Yo Richard!

On Mon, 23 Feb 2015 09:50:43 +0100
Post by Richard Cochran
Post by Gary E. Miller
Post by Richard Cochran
PTP doesn't work well anyhow over wifi because of the unpredictable
transmission scheme.
Can it be worse than just plain old NTP?
I would think it can, depending on the PI weights.
Ouch... But I guess pure theory until a WiFi driver has real support.
Post by Richard Cochran
Post by Gary E. Miller
Sadly no luck getting phc2sys to work for me on a system already
kong ~ # phc2sys -a -m -l 7
This would be without ntpshm.
Odd, so automatic configuration not so automatic. I told ptp4l to
use ntpshm, that should be passed along.
Post by Richard Cochran
Post by Gary E. Miller
phc2sys[880.851]: PI servo: sync interval 1.000 kp 0.700 ki 0.300000
phc2sys[881.852]: reconfiguring after port state change
phc2sys[881.852]: master clock not ready, waiting...
phc2sys -a -r -i eth0 -m -q
That fails more than one way:

kong ~ # phc2sys -a -r -i eth0 -m -q
'-i' has been deprecated. please use '-s' instead.
autoconfiguration cannot be mixed with manual config options.
Post by Richard Cochran
Post by Gary E. Miller
I looked at doing manual settings and got confused. Like what to set
for -M? Same as ptp4l or unique?
So, backing up a bit, when using hardware time stamping, the ptp4l
servo should never be set to ntpshm. That is only for software time
stamping.
That seems odd. It is standard to run chronyd with input from NMEA
(software) time and PPS (hardware) time. Makes it easy to see the
quality of each.

So what, backing up more, should my ptp4l be set to?

Here is my config:

kong ~ # cat /usr/local/etc/ptp4l.conf
[global]
# free_running 1
summary_interval 10
time_stamping software
clock_servo ntpshm
ntpshm_segment 2

[eth0]
Post by Richard Cochran
For a HW time stamping setup, I would run ptp4l with defaults,
Which will not start for me:

kong ~ # ptp4l
no interface specified

But this works:

kong ~ # ptp4l -i eth0 &
Post by Richard Cochran
and
then phc2sys like so.
phc2sys -s eth0 -E ntpshm -m -q
The -M value only has to agree with your ntp setup.
Which also fails instantly for me:

kong ~ # phc2sys -s eth0 -E ntpshm -m -q -M 2
time offset must be specified using -w or -O

This works, but is off by a mere 4,000 Seconds!

kong ~ # ptp4l -i eth0 &
kong ~ # phc2sys -s eth0 -E ntpshm -m -q -M 2 -w
phc2sys[82492.705]: phc offset -59931 s0 freq +0 delay 1580
phc2sys[82493.705]: phc offset -58092 s0 freq +0 delay 1509
phc2sys[82494.705]: phc offset -56906 s0 freq +0 delay 1516
phc2sys[82495.705]: phc offset -56815 s0 freq +0 delay 1516
phc2sys[82496.705]: phc offset -56684 s0 freq +0 delay 1517
phc2sys[82497.705]: phc offset -70368744234187 s0 freq +0 delay 1516
phc2sys[82498.706]: phc offset -70368744232479 s0 freq +0 delay 1517
phc2sys[82499.706]: phc offset -140737488408265 s0 freq +0 delay 1516
phc2sys[82500.706]: phc offset -140737488408179 s0 freq +0 delay 1516

Way crazy!
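
One observation about the log above, offered as arithmetic rather than
diagnosis: the large offsets sit very close to negative multiples of
2^46 ns (roughly 70,369 s), and removing that multiple leaves a residual
in the same range as the earlier samples. The sketch below only performs
that decomposition. Treating 2^46 ns as the step size and the helper name
split_offset are assumptions made for illustration; nothing here shows
where in the driver or in phc2sys such a step would originate.

```python
# Assumed step size for illustration: 2**46 nanoseconds.
STEP_NS = 2 ** 46

# "phc offset" values copied from the phc2sys log above, in nanoseconds.
LOGGED_OFFSETS_NS = [
    -59931,
    -58092,
    -56906,
    -56815,
    -56684,
    -70368744234187,
    -70368744232479,
    -140737488408265,
    -140737488408179,
]


def split_offset(offset_ns, step_ns=STEP_NS):
    """Split an offset into (whole steps, residual nearest zero).

    Integer arithmetic only, so no precision is lost on large values.
    """
    steps = (offset_ns + step_ns // 2) // step_ns
    return steps, offset_ns - steps * step_ns


def decompose_all(offsets_ns):
    """Apply split_offset to every logged sample."""
    return [split_offset(value) for value in offsets_ns]
```

Running decompose_all(LOGGED_OFFSETS_NS) gives step counts of 0, then -1,
then -2, with every residual staying between -60000 and -52000 ns.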
Post by Richard Cochran
I admit the options for phc2sys are really confusing. On the one
hand, the phc2sys underwent an "organic" development process, and on
the other hand, there really are a great many ways to configure the
system clock and one (or more!) PTP hardware clocks. I always have to
re-read the man page every time.
Which is why I am working on this:

http://www.catb.org/gpsd/gpsd-time-service-howto.html

And another reason we are trying to come up with an improved SHM
protocol on gpsd-dev and ntpd-dev. Way too easy for two servers
to clobber one shmid.

Any reason not to merge ptp4l and phc2sys?

And another cesspool is pmc. I see it makes a nice low level tool, but
time for a simple way to use it like 'pmc -a' and 'pmc -A' which would
dump all possible data in short and long form, like 'hdparm -i' and
'hdparm -I' do for disks.

RGDS
GARY
Keller, Jacob E
2015-02-23 21:09:01 UTC
Hi,

I thought I would reply to answer the questions that I can.
Post by Gary E. Miller
Yo Richard!
On Mon, 23 Feb 2015 09:50:43 +0100
Post by Richard Cochran
Post by Gary E. Miller
Post by Richard Cochran
PTP doesn't work well anyhow over wifi because of the unpredictable
transmission scheme.
Can it be worse than just plain old NTP?
I would think it can, depending on the PI weights.
Ouch... But I guess pure theory until a WiFi driver has real support.
NTP and PTP are generally considered solutions to different but related
problems, so this shouldn't really be an issue.
Post by Gary E. Miller
Post by Richard Cochran
Post by Gary E. Miller
Sadly no luck getting phc2sys to work for me on a system already
kong ~ # phc2sys -a -m -l 7
This would be without ntpshm.
Odd, so automatic configuration not so automatic. I told ptp4l to
use ntpshm, that should be passed along.
ptp4l in hardware timestamp mode controls a hardware clock, which is
completely unrelated to the system time clock. Then, phc2sys controls
the system clock based on the hardware clock which is set by ptp4l

This is sort of the flow.

network -> ptp4l -> NIC hardware clock -> phc2sys -> system clock

Telling ptp4l to use the ntpshm doesn't work unless ptp4l is in software
mode because ptp4l doesn't control the software clock in hardware
timestamp mode.
Post by Gary E. Miller
Post by Richard Cochran
Post by Gary E. Miller
phc2sys[880.851]: PI servo: sync interval 1.000 kp 0.700 ki 0.300000
phc2sys[881.852]: reconfiguring after port state change
phc2sys[881.852]: master clock not ready, waiting...
phc2sys -a -r -i eth0 -m -q
kong ~ # phc2sys -a -r -i eth0 -m -q
'-i' has been deprecated. please use '-s' instead.
autoconfiguration cannot be mixed with manual config options.
Yea, I think you just need to drop the -i eth0 part.
Post by Gary E. Miller
Post by Richard Cochran
Post by Gary E. Miller
I looked at doing manual settings and got confused. Like what to set
for -M? Same as ptp4l or unique?
So, backing up a bit, when using hardware time stamping, the ptp4l
servo should never be set to ntpshm. That is only for software time
stamping.
That seems odd. It is standard to run chronyd with input from NMEA
(software) time and PPS (hardware) time. Makes it easy to see the
quality of each.
So what, backing up more, should my ptp4l be set to?
kong ~ # cat /usr/local/etc/ptp4l.conf
[global]
# free_running 1
summary_interval 10
time_stamping software
clock_servo ntpshm
ntpshm_segment 2
[eth0]
Post by Richard Cochran
For a HW time stamping setup, I would run ptp4l with defaults,
kong ~ # ptp4l
no interface specified
ptp4l does not by default look up any configuration file. default.cfg is
provided to show the default settings. You must specify the
configuration file via "-f" option.
Post by Gary E. Miller
kong ~ # ptp4l -i eth0 &
This won't do what you think because you didn't actually provide the
configuration file.
Post by Gary E. Miller
Post by Richard Cochran
and
then phc2sys like so.
phc2sys -s eth0 -E ntpshm -m -q
The -M value only has to agree with your ntp setup.
kong ~ # phc2sys -s eth0 -E ntpshm -m -q -M 2
time offset must be specified using -w or -O
This works, but is off by a mere 4,000 Seconds!
kong ~ # ptp4l -i eth0 &
kong ~ # phc2sys -s eth0 -E ntpshm -m -q -M 2 -w
phc2sys[82492.705]: phc offset -59931 s0 freq +0 delay 1580
phc2sys[82493.705]: phc offset -58092 s0 freq +0 delay 1509
phc2sys[82494.705]: phc offset -56906 s0 freq +0 delay 1516
phc2sys[82495.705]: phc offset -56815 s0 freq +0 delay 1516
phc2sys[82496.705]: phc offset -56684 s0 freq +0 delay 1517
phc2sys[82497.705]: phc offset -70368744234187 s0 freq +0 delay 1516
phc2sys[82498.706]: phc offset -70368744232479 s0 freq +0 delay 1517
phc2sys[82499.706]: phc offset -140737488408265 s0 freq +0 delay 1516
phc2sys[82500.706]: phc offset -140737488408179 s0 freq +0 delay 1516
Way crazy!
Post by Richard Cochran
I admit the options for phc2sys are really confusing. On the one
hand, the phc2sys underwent an "organic" development process, and on
the other hand, there really are a great many ways to configure the
system clock and one (or more!) PTP hardware clocks. I always have to
re-read the man page every time.
I am (slowly) working on updating the configuration code for all our
utilities which should help alleviate some of these issues.

Regards,
Jake
Post by Gary E. Miller
http://www.catb.org/gpsd/gpsd-time-service-howto.html
And another reason we are trying to come up with an improved SHM
protocol on gpsd-dev and ntpd-dev. Way too easy for two servers
to clobber one shmid.
Any reason not to merge ptp4l and phc2sys?
I'll let others comment on this.
Post by Gary E. Miller
And another cesspool is pmc. I see it makes a nice low level tool, but
time for a simple way to use it like 'pmc -a' and 'pmc -A' which would
dump all possible data in short and long form, like 'hdparm -i' and
'hdparm -I' do for disks.
pmc is used to directly talk the management protocol specified by the
PTP standard. I don't know if pmc -a or pmc -A make sense.

Regards,
Jake
Gary E. Miller
2015-02-23 21:48:28 UTC
Yo Jacob E!

On Mon, 23 Feb 2015 21:09:01 +0000
Post by Keller, Jacob E
Post by Gary E. Miller
Ouch... But I guess pure theory until a WiFi driver has real support.
NTP and PTP are generally considered solutions to different but
related problems, so this shouldn't really be an issue.
As a gpsd maintainer and longtime chronyd/ntpd user I 100% disagree!

Not having chronyd or ntpd in the loop is a total non-starter for me.
Post by Keller, Jacob E
Post by Gary E. Miller
Odd, so automatic configuration not so automatic. I told ptp4l to
use ntpshm, that should be passed along.
ptp4l in hardware timestamp mode controls a hardware clock, which is
completely unrelated to the system time clock.
Understood.
Post by Keller, Jacob E
Then, phc2sys controls
the system clock based on the hardware clock which is set by ptp4l
Gack, maybe that is one way, but not a good way. Certainly not the
only way.

Since ptp4l can control the ntpshm in software timestamping mode why
can it not do so in hardware timestamping mode? Does ptp4l in hardware
mode ignore ntpshm, or does it continue to post parallel
software mode timing?
Post by Keller, Jacob E
This is sort of the flow.
network -> ptp4l -> NIC hardware clock -> phc2sys -> system clock
So how do I get ntpshm in there?

And not valid for a local PPS based PTP master, maybe more like:

PPS -> ntpd -> local sysclock -> ptp4l -> NIC hardware clock -> phc2sys -> ntpshm

And it has to be a loop, so when PPS is lost that host can feed the hardware
clock back into ntpd.
Post by Keller, Jacob E
Telling ptp4l to use the ntpshm doesn't work unless ptp4l is in
software mode because ptp4l doesn't control the software clock in
hardware timestamp mode.
Weird. Why are not ptp4l and phc2sys one program? They are certainly
deeply intertwined. And ptp4l in software mode already does most of what
phc2sys does in hardware mode.
Post by Keller, Jacob E
Post by Gary E. Miller
kong ~ # phc2sys -a -r -i eth0 -m -q
'-i' has been deprecated. please use '-s' instead.
autoconfiguration cannot be mixed with manual config options.
Yea, I think you just need to drop the -i eth0 part.
With no -i, phc2sys complains it has no port to connect to.

And not replace with the -s? A bug in the error message?
Post by Keller, Jacob E
Post by Gary E. Miller
kong ~ # ptp4l
no interface specified
ptp4l does not by default look up any configuration file.
Understood. Another reason why I have only been running in default
mode when asked to do that, to prove that default mode does not work.
Post by Keller, Jacob E
default.cfg
is provided to show the default settings. You must specify the
configuration file via "-f" option.
Yup, if you look at my incomplete HOWTO you see I do that:

http://www.catb.org/gpsd/gpsd-time-service-howto.html
Post by Keller, Jacob E
Post by Gary E. Miller
kong ~ # ptp4l -i eth0 &
This won't do what you think because you didn't actually provide the
configuration file.
Clearly it is non-functional. But it does what I was asked to try and
proved it does not work in any rational manner.
Post by Keller, Jacob E
Post by Gary E. Miller
Post by Richard Cochran
I admit the options for phc2sys are really confusing. On the one
hand, the phc2sys underwent an "organic" development process, and
on the other hand, there really are a great many ways to
configure the system clock and one (or more!) PTP hardware
clocks. I always have to re-read the man page every time.
I am (slowly) working on updating the configuration code for all our
utilities which should help alleviate some of these issues.
I look forward to that, but I've been told this can work as it exists
now. Not found any way to make hardware mode work (with ntpshm) yet.
Post by Keller, Jacob E
Post by Gary E. Miller
And another cesspool is pmc. I see it makes a nice low level tool,
but time for a simple way to use it like 'pmc -a' and 'pmc -A'
which would dump all possible data in short and long form, like
'hdparm -i' and 'hdparm -I' do for disks.
pmc is used to directly talk the management protocol specified by the
PTP standard. I don't know if pmc -a or pmc -A make sense.
Why not? hdparm implements the SATA spec at low and high levels. Ditto
sdparm for SCSI and stty for RS-232.

This stuff needs to be de-mystified and automated if real users are
going to use it. RedHat has 10 pages of HOWTO and they still don't
get it right.

RGDS
GARY
Keller, Jacob E
2015-02-23 22:40:17 UTC
Hi,
Post by Gary E. Miller
Yo Jacob E!
On Mon, 23 Feb 2015 21:09:01 +0000
Post by Keller, Jacob E
Post by Gary E. Miller
Ouch... But I guess pure theory until a WiFi driver has real support.
NTP and PTP are generally considered solutions to different but
related problems, so this shouldn't really be an issue.
As a gpsd maintainer and longtime chronyd/ntpd user I 100% disagree!
Not having chronyd or ntpd in the loop is a total non-starter for me.
I think I was misleading in my original comment here. I meant to say
that accuracy of PTP vs NTP over WiFi could very well be different
simply because they are very different algorithms with somewhat related
but differing goals. I did not mean to say that a full solution would
not use ntpd or chronyd or similar.
Post by Gary E. Miller
Post by Keller, Jacob E
Post by Gary E. Miller
Odd, so automatic configuration not so automatic. I told ptp4l to
use ntpshm, that should be passed along.
ptp4l in hardware timestamp mode controls a hardware clock, which is
completely unrelated to the system time clock.
Understood.
Post by Keller, Jacob E
Then, phc2sys controls
the system clock based on the hardware clock which is set by ptp4l
Gack, maybe that is one way, but not a good way. Certainly not the
only way.
Since ptp4l can control the ntpshm in software timestamping mode why
can it not do so in hardware timestamping mode? Does ptp4l in hardware
mode ignore ntpshm, or does it continue to post parallel
software mode timing?
This is a complex problem. Hardware timestamping is done by varying
hardware at the MAC (or PHY) level. Each hardware has its own internal
clock to do this. These clocks are in *no* way driven by or in sync with
the kernel clocks. Thus, the ptp kernel core exposes these each as their
own clock device /dev/ptpX

ptp4l in hardware mode completely ignores anything related to the
software kernel clock world. This is because hardware NICs have their
own internal clocks for doing hardware timestamping. Thus all timestamps
are relative to the clock on the MAC, not to any clock on the
motherboard.

There have been attempts in the past for hardware to "sync" to the
kernel clock but these were almost all faulty for various reasons
including that they can't run a full servo in the kernel.

It really doesn't make any sense for ptp4l to attempt to relate them
to the software kernel clock at all. Thus, ptp4l will drive the ptp
hardware clock from PTP on the network. Then a separate program (phc2sys
or other) drives the software->hardware sync.

*however* when in software timestamp mode the origin of the timestamps
*is* the kernel clock, so thus we drive the kernel timer from ptp4l
directly.

In hardware mode we are not driving the clock directly but only
indirectly.

I hope this makes sense?
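
Since each NIC clock appears as its own /dev/ptpX device, it can be read
independently of the system clock. A minimal sketch, assuming Linux,
Python 3.7 or later, a readable /dev/ptp0, and the kernel's FD_TO_CLOCKID
convention for dynamic POSIX clocks. The function names are illustrative,
and whether time.clock_gettime_ns accepts the resulting negative clock id
on a given platform is an assumption that should be verified before
relying on it.

```python
import os
import time

CLOCKFD = 3  # low bits used by the kernel to mark a file-descriptor clock


def fd_to_clockid(fd):
    """Python equivalent of the C macro FD_TO_CLOCKID(fd)."""
    return (~fd << 3) | CLOCKFD


def phc_minus_system_ns(device="/dev/ptp0"):
    """Rough PHC minus CLOCK_REALTIME difference in nanoseconds.

    The PHC read is bracketed by two system clock reads and compared
    against their midpoint. This is a single sample with no filtering.
    """
    fd = os.open(device, os.O_RDONLY)
    try:
        clock_id = fd_to_clockid(fd)
        before = time.clock_gettime_ns(time.CLOCK_REALTIME)
        phc = time.clock_gettime_ns(clock_id)
        after = time.clock_gettime_ns(time.CLOCK_REALTIME)
    finally:
        os.close(fd)
    return phc - (before + after) // 2
```

Calling phc_minus_system_ns() on a host with a PHC returns one raw
difference. It includes any fixed offset between the PHC timescale and
the system clock, which is the quantity the phc2sys -w and -O options
deal with.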
Post by Gary E. Miller
Post by Keller, Jacob E
This is sort of the flow.
network -> ptp4l -> NIC hardware clock -> phc2sys -> system clock
So how do I get ntpshm in there?
I assume you tell phc2sys to use ntpshm?
Post by Gary E. Miller
PPS -> ntpd -> local sysclock -> ptp4l -> NIC hardware clock -> phc2sys -> ntpshm
And it has to be a loop, so when PPS is lost that host can feed the hardware
clock back into ntpd.
Post by Keller, Jacob E
Telling ptp4l to use the ntpshm doesn't work unless ptp4l is in
software mode because ptp4l doesn't control the software clock in
hardware timestamp mode.
Weird. Why are not ptp4l and phc2sys one program? They are certainly
deeply intertwined. And ptp4l in software mode already does most of what
phc2sys does in hardware mode.
I'll leave Richard and others to answer this question, though I suspect
because ptp4l controls only the one clock which is handling
timestamping. It just happens that in software mode this is the kernel
clock.
Post by Gary E. Miller
Post by Keller, Jacob E
Post by Gary E. Miller
kong ~ # phc2sys -a -r -i eth0 -m -q
'-i' has been deprecated. please use '-s' instead.
autoconfiguration cannot be mixed with manual config options.
Yea, I think you just need to drop the -i eth0 part.
With no -i, phc2sys complains it has no port to connect to.
And not replace with the -s? A bug in the error message?
I am unsure, but... it should be connecting to the ptp4l Unix socket to
get its configuration from the ptp4l daemon. I'll defer to someone who
is more familiar with automatic mode.
Post by Gary E. Miller
Post by Keller, Jacob E
Post by Gary E. Miller
kong ~ # ptp4l
no interface specified
ptp4l does not by default look up any configuration file.
Understood. Another reason why I have only been running in default
mode when asked to do that, to prove that default mode does not work.
I only mentioned you had to pass -f because you said you ran "ptp4l" on
its own without passing the -f parameter. This means anything in your
configuration wouldn't be applied.
Post by Gary E. Miller
Post by Keller, Jacob E
default.cfg
is provided to show the default settings. You must specify the
configuration file via "-f" option.
http://www.catb.org/gpsd/gpsd-time-service-howto.html
Post by Keller, Jacob E
Post by Gary E. Miller
kong ~ # ptp4l -i eth0 &
This won't do what you think because you didn't actually provide the
configuration file.
Clearly it is non-functional. But it does what I was asked to try and
proved it does not work in any rational manner.
Post by Keller, Jacob E
Post by Gary E. Miller
Post by Richard Cochran
I admit the options for phc2sys are really confusing. On the one
hand, the phc2sys underwent an "organic" development process, and
on the other hand, there really are a great many ways to
configure the system clock and one (or more!) PTP hardware
clocks. I always have to re-read the man page every time.
I am (slowly) working on updating the configuration code for all our
utilities which should help alleviate some of these issues.
I look forward to that, but I've been told this can work as it exists
now. Not found any way to make hardware mode work (with ntpshm) yet.
Hopefully we can get that sorted out. I don't have a solution to it
offhand.
Post by Gary E. Miller
Post by Keller, Jacob E
Post by Gary E. Miller
And another cesspool is pmc. I see it makes a nice low level tool,
but time for a simple way to use it like 'pmc -a' and 'pmc -A'
which would dump all possible data in short and long form, like
'hdparm -i' and 'hdparm -I' do for disks.
pmc is used to directly talk the management protocol specified by the
PTP standard. I don't know if pmc -a or pmc -A make sense.
Why not? hdparm implements the SATA spec at low and high levels. Ditto
sdparm for SCSI and stty for RS-232.
I mean to say that I am unsure if pmc should be the "hdparm" equivalent.
I am not sure the management protocol is exactly what you are looking
for here. Maybe it is.
Post by Gary E. Miller
This stuff needs to be de-mystified and automated if real users are
going to use it. RedHat has 10 pages of HOWTO and they still don't
get it right.
PTP is quite complicated, so this is not entirely surprising.
Gary E. Miller
2015-02-23 23:22:17 UTC
Yo Jacob E!

On Mon, 23 Feb 2015 22:40:17 +0000
Post by Keller, Jacob E
Post by Gary E. Miller
Not having chronyd or ntpd in the loop is a total non-starter for me.
I think I was misleading in my original comment here. I meant to say
that accuracy of PTP vs NTP over WiFi could very well be different
simply because they are very different algorithms with somewhat
related but differing goals. I did not mean to say that a full
solution would not use ntpd or chronyd or similar.
Fair enough.
Post by Keller, Jacob E
Post by Gary E. Miller
Since ptp4l can control the ntpshm in software timestamping mode why
can it not do so in hardware timestamping mode? Does ptp4l in
hardware mode ignore ntpshm, or does it continue to
post parallel software mode timing?
This is a complex problem. Hardware timestamping is done by varying
hardware at the MAC (or PHY) level. Each hardware has its own internal
clock to do this. These clocks are in *no* way driven by or in sync
with the kernel clocks. Thus, the ptp kernel core exposes these each
as their own clock device /dev/ptpX
Understood.
Post by Keller, Jacob E
ptp4l in hardware mode completely ignores anything related to the
software kernel clock world. This is because hardware NICs have their
own internal clocks for doing hardware timestamping. Thus all
timestamps are relative to the clock on the MAC, not to any clock on
the motherboard.
Are you saying options like -M are ignored in hardware timestamp mode?

If so should not -M be an illegal option in hardware mode?
Post by Keller, Jacob E
There have been attempts in the past for hardware to "sync" to the
kernel clock but these were almost all faulty for various reasons
including that they can't run a full servo in the kernel.
I would never suggest such a thing. That is a job for ntpd and some sort
of refclock shim (maybe phc2sys).
Post by Keller, Jacob E
It really doesn't make any sense for ptp4l to attempt to relate
them to the software kernel clock at all.
Nor would I suggest such a thing. If possible, both the hardware
and software timestamps should be independent and both accessible.
Post by Keller, Jacob E
Thus, ptp4l will drive the
ptp hardware clock from PTP on the network. Then a separate program
(phc2sys or other) drives the software->hardware sync.
Lost me. Whether you partition a thing into two programs, or one
program with two threads, or some other combo does not follow from your
argument.

Conceptually, and I suggest not seriously (yet), phc2sys could easily
just be a thread launched inside ptp4l when in hardware mode.
Post by Keller, Jacob E
*however* when in software timestamp mode the origin of the timestamps
*is* the kernel clock, so thus we drive the kernel timer from ptp4l
directly.
Uh, no. Not in my case, I have ptp4l driving ptp4l driving ntpd
which drives the kernel clock. This is the point of this config
line in my /usr/local/etc/ptp4l.conf:

clock_servo ntpshm
Post by Keller, Jacob E
In hardware mode we are not driving the clock directly but only
indirectly.
Unclear who 'we' is.
Post by Keller, Jacob E
I hope this makes sense?
Nope.
Post by Keller, Jacob E
Post by Gary E. Miller
Post by Keller, Jacob E
This is sort of the flow.
network -> ptp4l -> NIC hardware clock -> phc2sys -> system clock
So how do I get ntpshm in there?
I assume you tell phc2sys to use ntpshm?
Well, I am trying, and failing. Automatic mode fails flagrantly and
any manual mode I try either fails, does nothing, or totally flails my
system clock.
Post by Keller, Jacob E
Post by Gary E. Miller
Weird. Why are not ptp4l and phc2sys one program? They are
certainly deeply intertwined. And ptp4l in software mode already
does most of what phc2sys does in hardware mode.
I'll leave Richard and others to answer this question, though I
suspect because ptp4l controls only the one clock which is handling
timestamping. It just happens that in software mode this is the kernel
clock.
And ptp4l sends the acquired timestamps to ntpshm in software mode. Seems
to me it just needs to instead read the hardware timestamp and send it?

Or ptp4l could spawn phc2sys when hardware mode is selected? Or
-a mode could just work?
Post by Keller, Jacob E
Post by Gary E. Miller
Post by Keller, Jacob E
Post by Gary E. Miller
kong ~ # ptp4l
no interface specified
ptp4l does not by default look up any configuration file.
Understood. Another reason why I have only been running in default
mode when asked to do that, to prove that default mode does not work.
I only mentioned you had to pass -f because you said you ran "ptp4l"
on its own without passing the -f parameter. This means anything in
your configuration wouldn't be applied.
Understood. I'm trying everything suggested, no matter how odd, because
nothing seems to work.

I'll try anything you suggest, as long as it brings ntpshm into the
picture.
Post by Keller, Jacob E
Post by Gary E. Miller
Why not? hdparm implements the SATA spec at low and high levels.
Ditto sdparm for SCSI and stty for RS-232.
I mean to say that I am unsure if pmc should be the "hdparm"
equivalent. I am not sure the management protocol is exactly what you
are looking for here. Maybe it is.
Pending a better idea, and someone to code it, I'll stick to this one.
It would be pretty simple to code.

gpsd ended up with several tools for real time data visualization. If
someone were to write a pmc-like gui that real time displayed the PTP
data I would love it. The human brain is amazing at seeing patterns in
visual data. But I do not want to suggest work to volunteers beyond
what I need for basic functionality.
Post by Keller, Jacob E
Post by Gary E. Miller
This stuff needs to be de-mystified and automated if real users are
going to use it. RedHat has 10 pages of HOWTO and they still don't
get it right.
PTP is quite complicated, so this is not entirely surprising.
Remember all the knobs that used to be on TVs and Oscopes? They were
really complicated! But engineers fixed that.

GPSs and file systems are also complicated. But in 99% of the cases I
can just say "gpsd -n /dev/ttyS0" or "mount /dev/sda /mnt/hd" and things
just work. I enjoy being in the 1% that flips every switch, but we
should not inflict that on the 99%.

For example, if no -i is given, why not attach to all ethernet ports?
Just like just about every daemon out there.

As another example, detecting if hardware timestamping is possible is
trivial. Then the defaults can be automatically adjusted. If hardware
timestamping is possible, spawn()ing or pthreading phc2sys should be
easy. Else if hardware timestamping is not possible then just fall back
to software timestamping.
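
The detection step described above could be sketched roughly as follows. This is only an illustration, not anything ptp4l does today: it parses the text that `ethtool -T` prints and checks for three capability flags. The choice of those three flags as the requirement for hardware mode is an assumption, and the function names are made up for this example.

```python
import re
import subprocess

# Assumption: these three capabilities are what hardware timestamping needs.
# The names are the ones `ethtool -T` prints; the set itself is a guess.
REQUIRED_HW_FLAGS = frozenset({
    "SOF_TIMESTAMPING_TX_HARDWARE",
    "SOF_TIMESTAMPING_RX_HARDWARE",
    "SOF_TIMESTAMPING_RAW_HARDWARE",
})


def timestamping_flags(ethtool_output):
    """Collect every SOF_TIMESTAMPING_* name found in `ethtool -T` output."""
    return set(re.findall(r"\((SOF_TIMESTAMPING_[A-Z_]+)\)", ethtool_output))


def supports_hw_timestamping(ethtool_output):
    """True if all assumed-required hardware flags are present."""
    return REQUIRED_HW_FLAGS <= timestamping_flags(ethtool_output)


def query_interface(iface):
    """Run `ethtool -T <iface>` and return its text (needs ethtool installed)."""
    result = subprocess.run(["ethtool", "-T", iface],
                            capture_output=True, text=True, check=True)
    return result.stdout


# Sample capability lines in the format `ethtool -T` prints.
SAMPLE = """\
Capabilities:
hardware-transmit (SOF_TIMESTAMPING_TX_HARDWARE)
software-transmit (SOF_TIMESTAMPING_TX_SOFTWARE)
hardware-receive (SOF_TIMESTAMPING_RX_HARDWARE)
software-receive (SOF_TIMESTAMPING_RX_SOFTWARE)
software-system-clock (SOF_TIMESTAMPING_SOFTWARE)
hardware-raw-clock (SOF_TIMESTAMPING_RAW_HARDWARE)
"""

mode = "hardware" if supports_hw_timestamping(SAMPLE) else "software"
print("would pick", mode, "timestamping")
```

On a live system the sample text would be replaced by `query_interface("eth0")`, where the interface name is again only an example.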

But for now, I'll be happy if I can get a working HOWTO that is not
10 pages.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97701
***@rellim.com Tel:+1(541)382-8588
Keller, Jacob E
2015-02-24 00:38:12 UTC
Permalink
Hello,

I'm going to attempt to give you a cleaner set of instructions on how to
enable NTPSHM servo.

I assume that eth0 is your NIC, and that eth0 has hardware timestamping
support. I am going to use linreg servo for ptp4l, and ntpshm servo for
phc2sys.

Note this uses default hardware timestamping, E2E, over ipv4

$cat config.cfg
[global]
clock_servo linreg
uds_address /var/run/ptp4l

#background the process
$ptp4l -f config.cfg &
OR (for testing)
#foreground the process and print to stdout
$ptp4l -f config.cfg -m

Now, after this starts, I can start phc2sys in automatic mode. Note that
automatic mode does not mean all configuration is automatic (contrary to
its name). What it means is that port configuration is automatically
pulled from ptp4l and it follows state changes in ptp4l. Thus, to get
phc2sys to run with NTPSHM servo

-a for automatic,
-r for system time
-E ntpshm

and again -m to log to stdout

$phc2sys -a -r -E ntpshm -m

I put the following refclock in my chrony.conf

refclock SHM 0 poll 3 refid PTP0

Doing these steps works fine for me. I have more comments below, but
hopefully this outline will help you solve your issue.

I believe the key insight is that you tried to configure ptp4l as the
NTP SHM instead of phc2sys as the NTP SHM
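
The steps above can be collected into a small sketch that only builds the config text and command lines, without starting anything. The helper names are hypothetical. The "-i eth0" argument is an assumption added here, since ptp4l reports "no interface specified" when it is given no interface, and the "-M" segment number defaults to 0 in this sketch.

```python
def ptp4l_config(servo="linreg", uds_address="/var/run/ptp4l"):
    """Text of a minimal ptp4l config file with a [global] section."""
    return "[global]\nclock_servo {}\nuds_address {}\n".format(servo, uds_address)


def ptp4l_command(config_path, interface="eth0", log_to_stdout=True):
    """ptp4l invocation; '-i' is an assumption, '-f' and '-m' are from the steps."""
    cmd = ["ptp4l", "-i", interface, "-f", config_path]
    if log_to_stdout:
        cmd.append("-m")
    return cmd


def phc2sys_command(shm_segment=0, log_to_stdout=True):
    """phc2sys in automatic mode, feeding the ntpshm servo instead of the clock."""
    cmd = ["phc2sys", "-a", "-r", "-E", "ntpshm", "-M", str(shm_segment)]
    if log_to_stdout:
        cmd.append("-m")
    return cmd


def chrony_refclock(shm_segment=0, refid="PTP0"):
    """Matching chrony.conf line; the segment must agree with phc2sys -M."""
    return "refclock SHM {} poll 3 refid {}".format(shm_segment, refid)


print(ptp4l_config(), end="")
print(" ".join(ptp4l_command("config.cfg")))
print(" ".join(phc2sys_command()))
print(chrony_refclock())
```

The one thing the sketch tries to make visible is that the SHM segment number appears twice, once on the phc2sys side and once on the chrony side, and the two have to match.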
Post by Gary E. Miller
Yo Jacob E!
On Mon, 23 Feb 2015 22:40:17 +0000
Post by Keller, Jacob E
Post by Gary E. Miller
Not having chronyd or ntpd in the loop is a total non-starter for me.
I think I was mis-leading in my original comment here. I meant to say
that accuracy of PTP vs NTP over WiFi could very well be different
simply because they are very different algorithms with somewhat
related but differing goals. I did not mean to say that a full
solution would not use ntpd or chronyd or similar.
Fair enough.
Post by Keller, Jacob E
Post by Gary E. Miller
Since ptp4l can control the ntpshm in software timestamping mode why
can it not do so in hardware timestamping mode? Does ptp4l in
hardware mode ignore ntpshm, or does it continue to
post parallel software mode timing?
This is a complex problem. Hardware timestamping is done by varying
hardware at the MAC (or PHY) level. Each hardware has its own internal
clock to do this. These clocks are in *no* way driven by or in sync
with the kernel clocks. Thus, the ptp kernel core exposes these each
as their own clock device /dev/ptpX
Understood.
Post by Keller, Jacob E
ptp4l in hardware mode completely ignores anything related to the
software kernel clock world. This is because hardware NICs have their
own internal clocks for doing hardware timestamping. Thus all
timestamps are relative to the clock on the MAC, not to any clock on
the motherboard.
Are you saying options like -M are ignored in hardware timestamp mode?
-M is a phc2sys switch. It specifically modifies ntpshm servo's
settings. It is not a command line option for ptp4l, but for ptp4l, we
use a configuration file. In this case, ntpshm is a servo which is
supposed to control the clock.
Post by Gary E. Miller
If so should not -M be an illegal option in hardware mode?
phc2sys doesn't have "software" and "hardware" modes as its job is
purely to do clock syntonization/synchronization.
Post by Gary E. Miller
Post by Keller, Jacob E
There have been attempts in the past for hardware to "sync" to the
kernel clock but these were almost all faulty for various reasons
including that they can't run a full servo in the kernel.
I would never suggest such a thing. That is a job for ntpd and some sort
of refclock shim (maybe phc2sys).
Post by Keller, Jacob E
It really doesn't make any sense for ptp4l to attempt to relate
them to the software kernel clock at all.
Nor would I suggest such a thing. If possible, both the hardware
and software timestamps should be independent and both accessible.
They are, though ptp4l doesn't use both at the same time.
Post by Gary E. Miller
Post by Keller, Jacob E
Thus, ptp4l will drive the
ptp hardware clock from PTP on the network. Then a separate program
(phc2sys or other) drives the software->hardware sync.
Lost me. Whether you partition a thing into two programs, or one
program with two threads, or some other combo does not follow from your
argument.
It wasn't an argument. I was explaining how it works.
Post by Gary E. Miller
Conceptually, and I suggest not seriously (yet), phc2sys could easily
just be a thread launched inside ptp4l when in hardware mode.
It theoretically could. It was, however, designed as a separate program.
I do not know specific reasons for this.
Post by Gary E. Miller
Post by Keller, Jacob E
*however* when in software timestamp mode the origin of the timestamps
*is* the kernel clock, so thus we drive the kernel timer from ptp4l
directly.
Uh, no. Not in my case, I have ptp4l driving ptp4l driving ntpd
which drives the kernel clock. This is the point of this config
clock_servo ntpshm
That seems wrong. You should only have one instance of ptp4l? I assume
that was a typo?

In your case, you have ptp4l driving ntpd, via ntpshm yes.

For hardware timestamp modes what you want to do is:

ptp4l using linreg or pi servo, driving the hardware clock

phc2sys using servo ntpshm via the "-E" switch, and -M switch to set
NTPSHM segment information.

By default phc2sys uses PI servo.
Post by Gary E. Miller
Post by Keller, Jacob E
In hardware mode we are not driving the clock directly but only
indirectly.
Unclear who 'we' is.
Sorry, "we" is ptp4l. To clarify, when using hardware timestamps ptp4l
does not drive the system clock directly. It does in software mode, but
only because there is no other clock to drive.
Post by Gary E. Miller
Post by Keller, Jacob E
I hope this makes sense?
Nope.
Post by Keller, Jacob E
Post by Gary E. Miller
Post by Keller, Jacob E
This is sort of the flow.
network -> ptp4l -> NIC hardware clock -> phc2sys -> system clock
So how do I get ntpshm in there?
I assume you tell phc2sys to use ntpshm?
Well, I am trying, and failing. Automatic mode fails flagrantly and
any manual mode I try either fails, does nothing, or totally flails my
system clock.
Post by Keller, Jacob E
Post by Gary E. Miller
Weird. Why are not ptp4l and phc2sys one program? They are
certainly deeply intertwined. And ptp4l in software mode already
does most of what phc2sys does in hardware mode.
I'll leave Richard and others to answer this question, though I
suspect because ptp4l controls only the one clock which is handling
timestamping. It just happens that in software mode this is the kernel
clock.
And ptp4l sends the acquired timestamps to ntpshm in software mode. Seems
to me it just needs to instead read the hardware timestamp and send it?
It can't send hardware timestamp directly. This is because hardware
timestamps are relative to the MAC internal clock which has zero basis
for comparison to the system clock (ie: they aren't running at the same
rate and definitely can't be compared as if they're in the same domain).
If you sent hardware timestamps directly to NTPSHM the result would be
very bad.
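
To illustrate the point about the two clock domains: a raw hardware clock reading only becomes useful once it is paired with a system clock reading taken at nearly the same instant. A minimal sketch of that pairing is below. It brackets one hardware read between two system reads and takes the midpoint. The function name, the sign convention (system minus hardware), and the fake clock values are all assumptions for illustration, not the phc2sys implementation.

```python
def measure_offset(read_sys_ns, read_phc_ns):
    """Return (offset, delay) in ns for one bracketed measurement.

    read_sys_ns and read_phc_ns are callables returning nanoseconds.
    offset is system time minus hardware time at the midpoint of the
    two system reads; delay is how long the bracket took.
    """
    t1 = read_sys_ns()
    phc = read_phc_ns()
    t2 = read_sys_ns()
    delay = t2 - t1
    offset = t1 + delay // 2 - phc
    return offset, delay


# Fake clocks: the two domains have unrelated epochs, so the raw values
# differ by seconds even though both are "correct" in their own domain.
sys_readings = iter([1_000_000, 1_001_500])
offset, delay = measure_offset(lambda: next(sys_readings),
                               lambda: 5_000_000_000)
print("offset", offset, "ns, delay", delay, "ns")
```

It is this measured offset between the domains, not the raw hardware timestamp, that makes sense to hand to something like the NTP SHM.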
Post by Gary E. Miller
Or ptp4l could spawn phc2sys when hardware mode is selected? Or
-a mode could just work?
Theoretically, ptp4l could spawn phc2sys. Again, I'd defer to Richard on
why it doesn't today. Most likely your answer will be "Patches welcome".
Post by Gary E. Miller
Post by Keller, Jacob E
Post by Gary E. Miller
Post by Keller, Jacob E
Post by Gary E. Miller
kong ~ # ptp4l
no interface specified
ptp4l does not by default look up any configuration file.
Understood. Another reason why I have only been running in default
mode when asked to do that, to prove that default mode does not work.
I only mentioned you had to pass -f because you said you ran "ptp4l"
on its own without passing the -f parameter. This means anything in
your configuration wouldn't be applied.
Understood. I'm trying everything suggested, no matter how odd, because
nothing seems to work.
I'll try anything you suggest, as long as it brings ntpshm into the
picture.
Post by Keller, Jacob E
Post by Gary E. Miller
Why not? hdparm implements the SATA spec at low and high levels.
Ditto sdparm for SCSI and stty for RS-232.
I mean to say that I am unsure if pmc should be the "hdparm"
equivalent. I am not sure the management protocol is exactly what you
are looking for here. Maybe it is.
Pending a better idea, and someone to code it, I'll stick to this one.
It would be pretty simple to code.
gpsd ended up with several tools for real time data visualization. If
someone were to write a pmc-like gui that real time displayed the PTP
data I would love it. The human brain is amazing at seeing patterns in
visual data. But I do not want to suggest work to volunteers beyond
what I need for basic functionality.
Post by Keller, Jacob E
Post by Gary E. Miller
This stuff needs to be de-mystified and automated if real users are
going to use it. RedHat has 10 pages of HOWTO and they still don't
get it right.
PTP is quite complicated, so this is not entirely surprising.
Remember all the knobs that used to be on TVs and Oscopes? They were
really complicated! But engineers fixed that.
GPSs and file systems are also complicated. But in 99% of the cases I
can just say "gpsd -n /dev/ttyS0" or "mount /dev/sda /mnt/hd" and things
just work. I enjoy being in the 1% that flips every switch, but we
should not inflict that on the 99%.
For example, if no -i is given, why not attach to all ethernet ports?
Just like just about every daemon out there.
As another example, detecting if hardware timestamping is possible is
trivial. Then the defaults can be automatically adjusted. If hardware
timestamping is possible, spawn()ing or pthreading phc2sys should be
easy. Else if hardware timestamping is not possible then just fall back
to software timestamping.
But for now, I'll be happy if I can get a working HOWTO that is not
10 pages.
RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97701
Gary E. Miller
2015-02-24 01:44:45 UTC
Permalink
Yo Jacob E!

On Tue, 24 Feb 2015 00:38:12 +0000
Post by Keller, Jacob E
I'm going to attempt to give you a cleaner set of instructions on how
to enable NTPSHM servo.
Thank you.
Post by Keller, Jacob E
I assume that eth0 is your NIC, and that eth0 has hardware
timestamping support.
Yup. Seemingly so:

kong ~ # ethtool -T eth0
Time stamping parameters for eth0:
Capabilities:
hardware-transmit (SOF_TIMESTAMPING_TX_HARDWARE)
software-transmit (SOF_TIMESTAMPING_TX_SOFTWARE)
hardware-receive (SOF_TIMESTAMPING_RX_HARDWARE)
software-receive (SOF_TIMESTAMPING_RX_SOFTWARE)
software-system-clock (SOF_TIMESTAMPING_SOFTWARE)
hardware-raw-clock (SOF_TIMESTAMPING_RAW_HARDWARE)
PTP Hardware Clock: 0
Hardware Transmit Timestamp Modes:
off (HWTSTAMP_TX_OFF)
on (HWTSTAMP_TX_ON)
Hardware Receive Filter Modes:
none (HWTSTAMP_FILTER_NONE)
all (HWTSTAMP_FILTER_ALL)
ptpv1-l4-sync (HWTSTAMP_FILTER_PTP_V1_L4_SYNC)
ptpv1-l4-delay-req (HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ)
ptpv2-l4-sync (HWTSTAMP_FILTER_PTP_V2_L4_SYNC)
ptpv2-l4-delay-req (HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ)
ptpv2-l2-sync (HWTSTAMP_FILTER_PTP_V2_L2_SYNC)
ptpv2-l2-delay-req (HWTSTAMP_FILTER_PTP_V2_L2_DELAY_REQ)
ptpv2-event (HWTSTAMP_FILTER_PTP_V2_EVENT)
ptpv2-sync (HWTSTAMP_FILTER_PTP_V2_SYNC)
ptpv2-delay-req (HWTSTAMP_FILTER_PTP_V2_DELAY_REQ)

Any idea what the minimum set needed to support hardware timestamp mode
would be?
Post by Keller, Jacob E
I am going to use linreg servo for ptp4l, and
ntpshm servo for phc2sys.
I can't find much doc on what linreg is/does. Can you confirm it does not
ever write to sysclock? Looks that way to me, in which case seems lotta
work for little...
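
For what the name suggests, here is a generic least-squares line fit over a few (local time, master time) points. It is only a sketch of the general technique and makes no claim about what the linreg servo in linuxptp actually does or whether it touches the system clock. The sample points are invented: a local clock that runs 10 ppm slow and starts 500 ns behind.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented data: four samples one second apart on the local clock (ns),
# with the master clock gaining 10 us per second and a 500 ns head start.
local_ns = [0, 1_000_000_000, 2_000_000_000, 3_000_000_000]
master_ns = [500, 1_000_010_500, 2_000_020_500, 3_000_030_500]

slope, intercept = fit_line(local_ns, master_ns)
freq_error_ppb = (slope - 1.0) * 1e9
print("slope %.9f intercept %.0f freq error %.0f ppb"
      % (slope, intercept, freq_error_ppb))
```

In a fit like this the slope's distance from 1 is the frequency error and the intercept is the time offset, which is one way to read the "slope" and "intercept" fields a servo might log.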
Post by Keller, Jacob E
Note this uses default hardware timestamping, E2E, over ipv4
Not bad defaults, but as a rule (RFC) IPv6 is to be preferred over IPv4
when available.
Post by Keller, Jacob E
$cat config.cfg
[global]
clock_servo linreg
uds_address /var/run/ptp4l
uds_address is the default socket, so not needed?
Post by Keller, Jacob E
#background the process
$ptp4l -f config.cfg &
OR (for testing)
#foreground the process and print to stdout
$ptp4l -f config.cfg -m
Now, after this starts, I can start phc2sys in automatic mode. Note
that automatic mode does not mean all configuration is automatic
(contrary to its name). What it means is that port configuration is
automatically pulled from ptp4l and it follows state changes in
ptp4l. Thus, to get phc2sys to run with NTPSHM servo
-a for automatic,
-r for system time
-E ntpshm
and again -m to log to stdout
$phc2sys -a -r -E ntpshm -m
Plus "-M 2", since my ntp already has 2 SHMs.
Post by Keller, Jacob E
I put the following refclock in my chrony.conf
refclock SHM 0 poll 3 refid PTP0
In my case:

refclock SHM 2 poll 3 refid PTP0
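
A small sketch of how the segment number could be mapped to a SysV shared memory key, so that "-M 2" and "SHM 2" can be checked against what `ipcs -m` lists. The base key 0x4E545030 (ASCII "NTP0") is, to my understanding, the value used by the NTP SHM refclock convention, but treat it as an assumption and verify it on your own system. The function names are made up.

```python
# Assumed base key for NTP SHM segment 0; spells "NTP0" in ASCII.
NTP_SHM_BASE_KEY = 0x4E545030


def shm_key(unit):
    """SysV IPC key for NTP SHM segment number `unit`."""
    return NTP_SHM_BASE_KEY + unit


def key_as_text(key):
    """Render a 32-bit key as its four ASCII characters."""
    return key.to_bytes(4, "big").decode("ascii")


for unit in (0, 1, 2):
    key = shm_key(unit)
    print("SHM %d -> key 0x%08x (%s)" % (unit, key, key_as_text(key)))
```

If the assumption holds, segment 2 would show up in `ipcs -m` with key 0x4e545032.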
Post by Keller, Jacob E
Doing these steps works fine for me. I have more comments below, but
hopefully this outline will help you solve your issue.
Well, the results were ugly, I stopped it when it tried to servo
my clock 35183 seconds.

kong ~ # phc2sys -a -r -E ntpshm -m -M 2
phc2sys[102865.368]: reconfiguring after port state change
phc2sys[102865.368]: selecting CLOCK_REALTIME for synchronization
phc2sys[102865.368]: selecting eth0 as the master clock
phc2sys[102865.368]: phc offset -38203 s0 freq +0 delay 1520
phc2sys[102866.368]: phc offset -38129 s0 freq +0 delay 1507
phc2sys[102867.368]: phc offset -38041 s0 freq +0 delay 1517
phc2sys[102868.368]: phc offset -37890 s0 freq +0 delay 1516
phc2sys[102869.368]: phc offset -37718 s0 freq +0 delay 1516
phc2sys[102870.369]: phc offset -37629 s0 freq +0 delay 1516
phc2sys[102871.369]: phc offset -37547 s0 freq +0 delay 1517
phc2sys[102872.369]: phc offset -37502 s0 freq +0 delay 1580
phc2sys[102873.369]: phc offset -37342 s0 freq +0 delay 1579
phc2sys[102874.370]: phc offset -37164 s0 freq +0 delay 1642
phc2sys[102875.370]: phc offset -36963 s0 freq +0 delay 1516
phc2sys[102876.370]: phc offset -36865 s0 freq +0 delay 1517
phc2sys[102877.370]: phc offset -36724 s0 freq +0 delay 1580
phc2sys[102878.370]: phc offset -36511 s0 freq +0 delay 1516
phc2sys[102879.371]: phc offset -36326 s0 freq +0 delay 1516
phc2sys[102880.371]: phc offset -36150 s0 freq +0 delay 1516
phc2sys[102881.371]: phc offset -70368744213708 s0 freq +0 delay 1517
phc2sys[102882.371]: port 002590.fffe.f355da-1 changed state
phc2sys[102882.371]: reconfiguring after port state change
phc2sys[102882.371]: master clock not ready, waiting...
phc2sys[102885.372]: port 002590.fffe.f355da-1 changed state
phc2sys[102885.372]: reconfiguring after port state change
phc2sys[102885.372]: selecting CLOCK_REALTIME for synchronization
phc2sys[102885.372]: selecting eth0 as the master clock
phc2sys[102885.372]: phc offset -70368225388909 s0 freq +0 delay 1516
phc2sys[102886.372]: phc offset -70367625256973 s0 freq +0 delay 1792
phc2sys[102887.372]: phc offset -70367025106507 s0 freq +0 delay 1517
phc2sys[102888.373]: phc offset -70366424969228 s0 freq +0 delay 1890
phc2sys[102889.373]: phc offset -70365824785558 s0 freq +0 delay 1516
^Cphc2sys[102889.479]: phc offset -70365760953346 s0 freq +0 delay 1516
kong ~ #

And why is it saying anything about CLOCK_REALTIME? That is the time input
not output?
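
One observation on the log above, offered as arithmetic rather than a diagnosis: the offset goes from about -36150 ns to -70368744213708 ns in one step, and that jump is within about 100 ns of 2^46 ns (roughly 19.5 hours). The sketch below splits an offset into a whole number of 2^46 ns steps plus a residual. The idea that 2^46 is the relevant boundary is an assumption drawn only from these numbers; nothing in the thread confirms the cause.

```python
WRAP_NS = 2 ** 46  # 70368744177664 ns, about 70368.7 s


def split_offset(offset_ns, wrap=WRAP_NS):
    """Split offset_ns into (k, residual) with offset_ns == k * wrap + residual.

    k is the nearest whole number of wraps, so residual is the part of
    the offset that is left once the suspected wrap is removed.
    """
    k = round(offset_ns / wrap)
    return k, offset_ns - k * wrap


# Last sane sample and first bad sample from the phc2sys log.
for offset in (-36150, -70368744213708):
    k, residual = split_offset(offset)
    print("offset %d = %d * 2^46 + %d" % (offset, k, residual))
```

With the 2^46 step removed, the bad sample's residual is -36044 ns, which continues the trend of the samples just before it.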
Post by Keller, Jacob E
I believe the key insight is that you tried to configure ptp4l as the
NTP SHM instead of phc2sys as the NTP SHM
Could be, more insights needed.
Post by Keller, Jacob E
Post by Gary E. Miller
If so should not -M be an illegal option in hardware mode?
phc2sys doesn't have "software" and "hardware" modes as its job is
purely to do clock syntonization/synchronization.
Odd, then why do I not need to run phc2sys in software only mode? I
know for a fact I only need phc2sys in hardware mode. If phc2sys can
run in software mode why is ptp4l touching my clock/ntpshm at all??
Post by Keller, Jacob E
Post by Gary E. Miller
Nor would I suggest such a thing. If possible, both the hardware
and software timestamps should be independent and both accessible.
They are, though ptp4l doesn't use both at the same time.
A pity. We learn a lot in gpsd running serial (sentence) and hardware
(PPS) time in parallel.
Post by Keller, Jacob E
Post by Gary E. Miller
Post by Keller, Jacob E
Thus, ptp4l will drive the
ptp hardware clock from PTP on the network. Then a separate
program (phc2sys or other) drives the software->hardware sync.
Lost me. Whether you partition a thing into two programs, or one
program with two threads, or some other combo does not follow from
your argument.
It wasn't an argument. I was explaining how it works.
Fair enough. At some point I gotta find out who to have that
argument with.
Post by Keller, Jacob E
Post by Gary E. Miller
Post by Keller, Jacob E
*however* when in software timestamp mode the origin of the
timestamps *is* the kernel clock, so thus we drive the kernel
timer from ptp4l directly.
Uh, no. Not in my case, I have ptp4l driving ptp4l driving ntpd
which drives the kernel clock. This is the point of this config
clock_servo ntpshm
That seems wrong. You should only have one instance of ptp4l? I assume
that was a typo?
Sorry, yes my bad. ptp4l driving chrony/ntpd which drives the kernel
clock.
Post by Keller, Jacob E
In your case, you have ptp4l driving ntpd, via ntpshm yes.
In software mode. But you are telling me to have phc2sys drive ntpshm
instead in hardware mode.
Post by Keller, Jacob E
ptp4l using linreg or pi servo, driving the hardware clock
phc2sys using servo ntpshm via the "-E" switch, and -M switch to set
NTPSHM segment information.
Like this?

kong ~ # phc2sys -a -r -E ntpshm -m -M 2

Except that fails...
Post by Keller, Jacob E
Post by Gary E. Miller
Post by Keller, Jacob E
In hardware mode we are not driving the clock directly but only
indirectly.
Unclear who 'we' is.
Sorry, "we" is ptp4l. To clarify, when using hardware timestamps ptp4l
does not drive the system clock directly. It does in software mode,
but only because there is no other clock to drive.
Or, in my case, driving ntpd/chronyd which is driving the system clock.
Post by Keller, Jacob E
Post by Gary E. Miller
And ptp4l sends the acquired timestamps to ntpshm in software
mode. Seems to me it just needs to instead read the hardware
timestamp and send it?
It can't send hardware timestamp directly. This is because hardware
timestamps are relative to the MAC internal clock which has zero basis
for comparison to the system clock (ie: they aren't running at the
same rate and definitely can't be compared as if they're in the same
domain). If you sent hardware timestamps directly to NTPSHM the
result would be very bad.
OK, send the hardware timestamp to NTPSHM after the proper conversion
process is performed. I never assumed no translation required.
Post by Keller, Jacob E
Post by Gary E. Miller
Or ptp4l could spawn phc2sys when hardware mode is selected? Or
-a mode could just work?
Theoretically, ptp4l could spawn phc2sys. Again, I'd defer to Richard
on why it doesn't today. Most likely your answer will be "Patches
welcome".
I gotta get it to work once first. Thanks for your help and I'll
keep trying.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97701
***@rellim.com Tel:+1(541)382-8588
Gary E. Miller
2015-02-24 02:48:51 UTC
Permalink
Yo Jacob!

Just to summarize what I just tried, that failed. I repeated several times,
similar results, it just took varying times before going crazy, usually
10 to 90 seconds.

Here is my hardware:

kong ~ # ethtool -T eth0
Time stamping parameters for eth0:
Capabilities:
hardware-transmit (SOF_TIMESTAMPING_TX_HARDWARE)
software-transmit (SOF_TIMESTAMPING_TX_SOFTWARE)
hardware-receive (SOF_TIMESTAMPING_RX_HARDWARE)
software-receive (SOF_TIMESTAMPING_RX_SOFTWARE)
software-system-clock (SOF_TIMESTAMPING_SOFTWARE)
hardware-raw-clock (SOF_TIMESTAMPING_RAW_HARDWARE)
PTP Hardware Clock: 0
Hardware Transmit Timestamp Modes:
off (HWTSTAMP_TX_OFF)
on (HWTSTAMP_TX_ON)
Hardware Receive Filter Modes:
none (HWTSTAMP_FILTER_NONE)
all (HWTSTAMP_FILTER_ALL)
ptpv1-l4-sync (HWTSTAMP_FILTER_PTP_V1_L4_SYNC)
ptpv1-l4-delay-req (HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ)
ptpv2-l4-sync (HWTSTAMP_FILTER_PTP_V2_L4_SYNC)
ptpv2-l4-delay-req (HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ)
ptpv2-l2-sync (HWTSTAMP_FILTER_PTP_V2_L2_SYNC)
ptpv2-l2-delay-req (HWTSTAMP_FILTER_PTP_V2_L2_DELAY_REQ)
ptpv2-event (HWTSTAMP_FILTER_PTP_V2_EVENT)
ptpv2-sync (HWTSTAMP_FILTER_PTP_V2_SYNC)
ptpv2-delay-req (HWTSTAMP_FILTER_PTP_V2_DELAY_REQ)

Here I make sure there are no conflicting daemons:

kong ~ # killall ptp4l phc2sys
ptp4l: no process found
phc2sys: no process found
kong ~ # killall ptp4l phc2sys
ptp4l: no process found
phc2sys: no process found

Here is my config:

kong ~ # cat ptp.conf
[global]
clock_servo linreg

Start ptp4l:

kong ~ # ptp4l -i eth0 -l 7 -m -f ptp.conf &

And drop the bomb:

kong ~ # phc2sys -a -r -E ntpshm -m -M 2
phc2sys[354.145]: uds: sendto failed: No such file or directory

This one is odd, is uds_address not defaulted as documented?

Sadly, adding uds_address /var/run/ptp4l to my ptp.conf does not change
anything.

ptp4l[354.146]: selected /dev/ptp0 as PTP clock
ptp4l[354.183]: port 1: INITIALIZING to LISTENING on INITIALIZE
ptp4l[354.183]: port 0: INITIALIZING to LISTENING on INITIALIZE
ptp4l[354.570]: port 1: setting asCapable
phc2sys[355.146]: Waiting for ptp4l...
ptp4l[355.197]: port 0: setting asCapable
ptp4l[355.674]: port 1: new foreign master 003048.fffe.345fe2-1
phc2sys[356.198]: reconfiguring after port state change
phc2sys[356.198]: selecting eth0 for synchronization
phc2sys[356.198]: nothing to synchronize
ptp4l[359.341]: selected best master clock 003048.fffe.345fe2
ptp4l[359.341]: foreign master not using PTP timescale
ptp4l[359.341]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
ptp4l[359.361]: port 1: delay timeout
ptp4l[359.526]: port 1: delay timeout
phc2sys[360.198]: port 002590.fffe.f355da-1 changed state
phc2sys[360.198]: reconfiguring after port state change
phc2sys[360.198]: master clock not ready, waiting...
ptp4l[360.352]: port 1: delay timeout
ptp4l[360.353]: path delay 58263 58263
ptp4l[360.987]: master offset 16506354212 s0 freq -0 path delay 58263
ptp4l[361.746]: port 1: delay timeout
ptp4l[361.746]: path delay 59135 60008
ptp4l[361.904]: master offset 16506344102 s0 freq -0 path delay 59135
ptp4l[362.820]: master offset 16506334623 s0 freq -0 path delay 59135
ptp4l[363.681]: port 1: delay timeout
ptp4l[363.681]: path delay 58263 47033
ptp4l[363.737]: linreg: points 4 slope 1.000008825 intercept -16506326986 err 0
ptp4l[363.737]: master offset 16506327955 s1 freq -9794 path delay 58263
ptp4l[364.654]: linreg: points 4 slope 1.000008873 intercept 613 err 1412
ptp4l[364.654]: master offset -1412 s2 freq -9485 path delay 58263
ptp4l[364.654]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
phc2sys[365.199]: port 002590.fffe.f355da-1 changed state
phc2sys[365.199]: reconfiguring after port state change
phc2sys[365.199]: selecting CLOCK_REALTIME for synchronization
phc2sys[365.199]: selecting eth0 as the master clock
phc2sys[365.199]: phc offset -70353239245525 s0 freq +0 delay 1348

WTF was that???

ptp4l[365.571]: clockcheck: clock jumped forward or running faster than expected!
ptp4l[365.571]: master offset 70368744176888 s0 freq -9485 path delay 58263
ptp4l[365.571]: port 1: SLAVE to UNCALIBRATED on SYNCHRONIZATION_FAULT
ptp4l[366.164]: port 1: delay timeout
ptp4l[366.164]: path delay 58096 57929
phc2sys[366.199]: port 002590.fffe.f355da-1 changed state
phc2sys[366.199]: reconfiguring after port state change
phc2sys[366.199]: master clock not ready, waiting...
ptp4l[366.487]: master offset 70368744178475 s0 freq -9485 path delay 58096
ptp4l[367.404]: master offset 70368744179257 s0 freq -9485 path delay 58096
ptp4l[367.495]: port 1: delay timeout
ptp4l[367.495]: path delay 58263 61062
ptp4l[368.321]: linreg: points 4 slope 1.000008584 intercept -70368744179915 err 1412
ptp4l[368.321]: master offset 70368744179632 s2 freq +599999999 path delay 58263
ptp4l[368.321]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
ptp4l[368.810]: port 1: delay timeout
ptp4l[368.810]: negative path delay -69667
ptp4l[368.810]: path_delay = (t2 - t3) * rr + (t4 - t1) - (c1 + c2 + c3)
ptp4l[368.810]: t2 - t3 = -213724133
ptp4l[368.810]: t4 - t1 = +534175583
ptp4l[368.810]: rr = 2.500021454
ptp4l[368.810]: c1 0
ptp4l[368.810]: c2 0
ptp4l[368.810]: c3 0
ptp4l[368.810]: path delay 58096 -69667
ptp4l[369.140]: port 1: delay timeout
ptp4l[369.140]: negative path delay -60229
ptp4l[369.140]: path_delay = (t2 - t3) * rr + (t4 - t1) - (c1 + c2 + c3)
ptp4l[369.140]: t2 - t3 = -357590119
ptp4l[369.140]: t4 - t1 = +893862511
ptp4l[369.140]: rr = 2.500021454
ptp4l[369.140]: c1 0
ptp4l[369.140]: c2 0
ptp4l[369.140]: c3 0
ptp4l[369.140]: path delay 57929 -60229
phc2sys[369.199]: port 002590.fffe.f355da-1 changed state
phc2sys[369.199]: reconfiguring after port state change
phc2sys[369.199]: selecting CLOCK_REALTIME for synchronization
phc2sys[369.199]: selecting eth0 as the master clock
phc2sys[369.199]: phc offset -70353027879345 s0 freq +0 delay 1391
ptp4l[369.237]: linreg: points 4 slope 0.999941194 intercept -70368144182166 err 225182
ptp4l[369.237]: master offset 70368144249873 s2 freq +599999999 path delay 57929
ptp4l[370.073]: port 1: delay timeout
ptp4l[370.073]: path delay 58096 71844
ptp4l[370.154]: linreg: points 4 slope 0.999918331 intercept -70367544242678 err 113137
ptp4l[370.154]: master offset 70367544220469 s2 freq +599999999 path delay 58096
phc2sys[370.199]: phc offset -70352464198801 s0 freq +0 delay 1345
ptp4l[371.071]: linreg: points 4 slope 0.999940831 intercept -70366944235752 err 113128
ptp4l[371.071]: master offset 70366944190386 s2 freq +599999999 path delay 58096
phc2sys[371.199]: phc offset -70351900518690 s0 freq +0 delay 1379
ptp4l[371.706]: port 1: delay timeout
ptp4l[371.706]: path delay 58263 80317
ptp4l[371.987]: linreg: points 4 slope 1.000008261 intercept -70366344169510 err 112836
ptp4l[371.987]: master offset 70366344169768 s2 freq +599999999 path delay 58263
phc2sys[372.200]: phc offset -70351336831824 s0 freq +0 delay 1347
ptp4l[372.904]: linreg: points 4 slope 1.000008717 intercept -70365744150943 err 90551
ptp4l[372.904]: master offset 70365744150552 s2 freq +599999999 path delay 58263
phc2sys[373.200]: phc offset -70350773141966 s0 freq +0 delay 1388
ptp4l[373.617]: port 1: delay timeout
ptp4l[373.617]: path delay 58103 57944
ptp4l[373.821]: linreg: points 4 slope 1.000008674 intercept -70365144126319 err 75472
ptp4l[373.821]: master offset 70365144126321 s2 freq +599999999 path delay 58103
phc2sys[374.200]: phc offset -70350209474225 s0 freq +0 delay 1348
ptp4l[374.738]: linreg: points 4 slope 1.000008812 intercept -70364544099292 err 64723
ptp4l[374.738]: master offset 70364544099571 s2 freq +599999999 path delay 58103
ptp4l[374.831]: port 1: delay timeout
ptp4l[374.831]: path delay 58976 62467
phc2sys[375.200]: phc offset -70349645810969 s0 freq +0 delay 1347
ptp4l[375.654]: linreg: points 4 slope 1.000008650 intercept -70363944062902 err 56663
ptp4l[375.654]: master offset 70363944062598 s2 freq +599999999 path delay 58976
ptp4l[375.923]: port 1: delay timeout
ptp4l[375.923]: path delay 57936 53829
phc2sys[376.200]: phc offset -70349082116188 s0 freq +0 delay 1460
ptp4l[376.571]: linreg: points 4 slope 1.000008152 intercept -70363344039615 err 50595
ptp4l[376.571]: master offset 70363344040348 s2 freq +599999999 path delay 57936
phc2sys[377.200]: phc offset -70348518427659 s0 freq +0 delay 1343
ptp4l[377.488]: linreg: points 4 slope 1.000007828 intercept -70362744024915 err 45586
ptp4l[377.488]: master offset 70362744024897 s2 freq +599999999 path delay 57936
ptp4l[377.807]: port 1: delay timeout
ptp4l[377.807]: path delay 59445 60946
phc2sys[378.200]: phc offset -70347954751340 s0 freq +0 delay 1353
ptp4l[378.404]: linreg: points 4 slope 1.000008160 intercept -70362143984809 err 44713
ptp4l[378.404]: master offset 70362143983983 s2 freq +599999999 path delay 59445
phc2sys[379.200]: phc offset -70347391089799 s0 freq +0 delay 1388
ptp4l[379.321]: linreg: points 4 slope 1.000009185 intercept -70361543971115 err 43857
ptp4l[379.321]: master offset 70361543970987 s2 freq +599999999 path delay 59445
ptp4l[379.707]: port 1: delay timeout
ptp4l[379.707]: path delay 59445 42090
phc2sys[380.200]: phc offset -70346827428315 s0 freq +0 delay 1462
ptp4l[380.238]: linreg: points 4 slope 1.000009091 intercept -70360943944337 err 42997
ptp4l[380.238]: master offset 70360943944743 s2 freq +599999999 path delay 59445
ptp4l[380.465]: port 1: delay timeout
ptp4l[380.465]: path delay 55886 46490
ptp4l[381.155]: linreg: points 8 slope 1.000008246 intercept -70360343906082 err 33333
ptp4l[381.155]: master offset 70360343908136 s2 freq +599999999 path delay 55886
phc2sys[381.200]: phc offset -70346263731002 s0 freq +0 delay 1463
ptp4l[381.758]: port 1: delay timeout
ptp4l[381.759]: path delay 55886 45742
ptp4l[382.071]: linreg: points 8 slope 1.000008035 intercept -70359743870239 err 32710
ptp4l[382.071]: master offset 70359743871471 s2 freq +599999999 path delay 55886
phc2sys[382.200]: phc offset -70345700043849 s0 freq +0 delay 1464
ptp4l[382.451]: port 1: delay timeout
ptp4l[382.452]: path delay 56747 55551
ptp4l[382.988]: linreg: points 8 slope 1.000008136 intercept -70359143861734 err 32071
ptp4l[382.988]: master offset 70359143861383 s2 freq +599999999 path delay 56747
phc2sys[383.200]: phc offset -70345136355504 s0 freq +0 delay 1461
ptp4l[383.905]: linreg: points 8 slope 1.000008152 intercept -70358543843295 err 31462
ptp4l[383.905]: master offset 70358543842094 s2 freq +599999999 path delay 56747
ptp4l[384.188]: port 1: delay timeout
ptp4l[384.188]: path delay 54690 39523
phc2sys[384.200]: phc offset -70344572667058 s0 freq +0 delay 1350
ptp4l[384.313]: port 1: delay timeout
ptp4l[384.314]: path delay 53598 53368
ptp4l[384.821]: linreg: points 8 slope 1.000007880 intercept -70357943816603 err 30869
ptp4l[384.821]: master offset 70357943817385 s2 freq +599999999 path delay 53598
ptp4l[385.156]: port 1: delay timeout
ptp4l[385.156]: path delay 53598 57616
phc2sys[385.200]: phc offset -70344008980381 s0 freq +0 delay 1464
ptp4l[385.738]: linreg: points 8 slope 1.000007783 intercept -70357343791409 err 30264
ptp4l[385.738]: master offset 70357343791685 s2 freq +599999999 path delay 53598
phc2sys[386.201]: phc offset -70343445298408 s0 freq +0 delay 1466
ptp4l[386.655]: linreg: points 8 slope 1.000007932 intercept -70356743766393 err 29678
ptp4l[386.655]: master offset 70356743766013 s2 freq +599999999 path delay 53598
ptp4l[386.967]: port 1: delay timeout
ptp4l[386.968]: path delay 51744 50120
phc2sys[387.201]: phc offset -70342881625905 s0 freq +0 delay 1352
ptp4l[387.571]: linreg: points 8 slope 1.000007910 intercept -70356143740286 err 29130
ptp4l[387.571]: master offset 70356143742033 s2 freq +599999999 path delay 51744

I stopped it here as it tried to step my good clock by -70356s.
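
The "negative path delay" lines in the log above can be checked by hand. The sketch below is Python, not part of linuxptp, and the function name is made up for illustration. It plugs the logged values into the formula ptp4l prints. One assumption: the printed formula gives a round-trip figure and the reported path delay is half of it. That halving is not shown in the log; it is inferred here because it reproduces the logged numbers.

```python
def path_delay_ns(t2_minus_t3, t4_minus_t1, rr, c1=0, c2=0, c3=0):
    """Evaluate the formula ptp4l prints, then halve it (assumed step)."""
    round_trip = t2_minus_t3 * rr + t4_minus_t1 - (c1 + c2 + c3)
    return round_trip / 2


# (t2 - t3, t4 - t1, rr, logged path delay), copied from the ptp4l output.
LOGGED = [
    (-213724133, 534175583, 2.500021454, -69667),
    (-357590119, 893862511, 2.500021454, -60229),
]

for d23, d41, rr, logged in LOGGED:
    computed = path_delay_ns(d23, d41, rr)
    print(f"computed {computed:.1f} ns, logged {logged} ns")
```

The rate ratio rr would normally be very close to 1, so a value near 2.5 is itself a sign that something is badly off by the time these samples are taken.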

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97701
***@rellim.com Tel:+1(541)382-8588
Gary E. Miller
2015-02-24 08:28:01 UTC
Yo Miroslav!

On Tue, 24 Feb 2015 08:24:00 +0100
Hm. Are you running 1.5 or current git? I'm not sure if it could
explain this, but there was a bug in 1.5 affecting the shm servo.
Oh, one other thing. Sometimes after running a timestamp hardware
test I can not revert to timestamp software and get a good time.

After my last test I had a persistent 150mS offset from ptp4l that
would not go away. Killing and restarting ptp4l did not help. I
had to reboot to get back to good time.

Another reason to use PTP with ntpshm. I bet this has been happening
but no one noticed since they did not have other refclocks.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97701
***@rellim.com Tel:+1(541)382-8588
Jiri Benc
2015-02-24 09:07:11 UTC
Post by Gary E. Miller
ptp4l[364.654]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
phc2sys[365.199]: port 002590.fffe.f355da-1 changed state
phc2sys[365.199]: reconfiguring after port state change
phc2sys[365.199]: selecting CLOCK_REALTIME for synchronization
phc2sys[365.199]: selecting eth0 as the master clock
phc2sys[365.199]: phc offset -70353239245525 s0 freq +0 delay 1348
WTF was that???
ptp4l[365.571]: clockcheck: clock jumped forward or running faster than expected!
ptp4l[365.571]: master offset 70368744176888 s0 freq -9485 path delay 58263
You are apparently running several programs that try to drive the same
clock. This is going to fail, no matter what software you use. This is
most likely a misconfiguration on your side, perhaps you're running
multiple ptp4l instances over the same net interface.

On a similar note, earlier in the thread, you mentioned that you want
the realtime (system) clock to be set by a program other than phc2sys.
Then you must not tell phc2sys to drive the system clock; otherwise the
two programs would fight each other. This means you must not pass "-r"
to phc2sys, since that option tells phc2sys to drive the system clock
(please do read the phc2sys man page before asking more questions about
this, thanks).

Now, with a single interface and no system clock to sync (i.e. just
phc2sys -a), there's just one clock (the internal clock of the NIC).
phc2sys does synchronization of two or more clocks. If it has just one
clock, it has nothing to synchronize it with and, as a consequence, it
does nothing.

As ntpshm is implemented as a servo, it does not come into the game at
all. phc2sys has nothing to synchronize, as synchronizing one clock is a
no-op, and thus does nothing. Using manual mode instead of the
automatic one won't change this.

What's needed is implementing ntpshm to be a clock, not a servo.

tl;dr: What you're trying to achieve does not work with linuxptp
currently.

Jiri
--
Jiri Benc
Gary E. Miller
2015-02-24 22:13:10 UTC
Yo Jacob E!

On Tue, 24 Feb 2015 21:33:49 +0000
Post by Gary E. Miller
# killall ptp4l phc2sys
This is probably the root of your problem. Nothing else *should* be
writing the clock but any software that has privileges and access
to /dev/ptpX could write to it.
What else *could* be writing /dev/ptpX? I have never installed ptpd or
anything similar.
Also your driver can do it in case of a reset.
I've not noticed any port resets.
Could you post your dmesg log? This is an Intel part that I have some
driver experience with so maybe I can spot any inconsistency there.
Here you go, first the debug output:


kong ~ # killall ptp4l phc2sys
ptp4l: no process found
phc2sys: no process found
kong ~ # killall ptp4l phc2sys
ptp4l: no process found
phc2sys: no process found
kong ~ # cat ptp.conf
[global]
clock_servo linreg
uds_address /var/run/ptp4l

kong ~ # ptp4l -i eth0 -l 7 -m -f ptp.conf &
kong ~ # sleep 3
ptp4l[48368.046]: selected /dev/ptp0 as PTP clock
ptp4l[48368.046]: port 1: INITIALIZING to LISTENING on INITIALIZE
ptp4l[48368.046]: port 0: INITIALIZING to LISTENING on INITIALIZE
ptp4l[48368.406]: port 1: setting asCapable
ptp4l[48368.406]: port 1: new foreign master 003048.fffe.345fe2-1
kong ~ # phc2sys -a -r -E ntpshm -m -M 2
ptp4l[48371.047]: port 0: setting asCapable
phc2sys[48372.047]: reconfiguring after port state change
phc2sys[48372.047]: selecting eth0 for synchronization
phc2sys[48372.047]: nothing to synchronize
ptp4l[48372.406]: selected best master clock 003048.fffe.345fe2
ptp4l[48372.406]: foreign master not using PTP timescale
ptp4l[48372.406]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
phc2sys[48373.048]: port 002590.fffe.f355da-1 changed state
phc2sys[48373.048]: reconfiguring after port state change
phc2sys[48373.048]: master clock not ready, waiting...
ptp4l[48373.473]: port 1: delay timeout
ptp4l[48373.474]: path delay 54411 54411
ptp4l[48374.419]: master offset 328777253 s0 freq -8900 path delay 54411
ptp4l[48374.911]: port 1: delay timeout
ptp4l[48374.912]: path delay 54730 55049
ptp4l[48375.419]: master offset 328777178 s0 freq -8900 path delay 54730
ptp4l[48375.518]: port 1: delay timeout
ptp4l[48375.519]: path delay 55049 64001
ptp4l[48376.419]: master offset 328775902 s0 freq -8900 path delay 55049
ptp4l[48376.912]: port 1: delay timeout
ptp4l[48376.912]: path delay 55455 55862
ptp4l[48377.419]: linreg: points 4 slope 1.000009562 intercept -328775458 err 0
ptp4l[48377.419]: master offset 328775472 s1 freq -9576 path delay 55455
ptp4l[48377.508]: port 1: delay timeout
ptp4l[48378.129]: port 1: delay timeout
ptp4l[48378.419]: linreg: points 4 slope 1.000009634 intercept 41 err 78
ptp4l[48378.419]: master offset 78 s2 freq -9675 path delay 55455
ptp4l[48378.419]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
phc2sys[48379.049]: port 002590.fffe.f355da-1 changed state
phc2sys[48379.049]: reconfiguring after port state change
phc2sys[48379.049]: selecting CLOCK_REALTIME for synchronization
phc2sys[48379.049]: selecting eth0 as the master clock
phc2sys[48379.049]: phc offset 328806192 s0 freq +0 delay 1516
ptp4l[48379.419]: linreg: points 4 slope 1.000009343 intercept -533 err 341
ptp4l[48379.419]: master offset 604 s2 freq -8810 path delay 55455
phc2sys[48380.049]: phc offset 328806161 s0 freq +0 delay 1516
ptp4l[48380.205]: port 1: delay timeout
ptp4l[48380.205]: path delay 55049 47086
ptp4l[48380.419]: linreg: points 4 slope 1.000009269 intercept -159 err 298
ptp4l[48380.419]: master offset 213 s2 freq -9110 path delay 55049
ptp4l[48380.637]: port 1: delay timeout
ptp4l[48380.638]: path delay 54730 43799
phc2sys[48381.049]: phc offset 328806265 s0 freq +0 delay 1517
ptp4l[48381.419]: linreg: points 4 slope 1.000009081 intercept -376 err 341
ptp4l[48381.419]: master offset 470 s2 freq -8705 path delay 54730
phc2sys[48382.049]: phc offset 328806509 s0 freq +0 delay 1515
ptp4l[48382.373]: port 1: delay timeout
ptp4l[48382.373]: path delay 55049 57392
ptp4l[48382.419]: linreg: points 4 slope 1.000008817 intercept -579 err 431
ptp4l[48382.419]: master offset 789 s2 freq -8238 path delay 55049
phc2sys[48383.049]: phc offset 328807191 s0 freq +0 delay 1517
ptp4l[48383.419]: linreg: points 4 slope 1.000008839 intercept 183 err 426
ptp4l[48383.419]: master offset -403 s2 freq -9023 path delay 55049
ptp4l[48383.578]: port 1: delay timeout
ptp4l[48383.578]: path delay 55455 60894
phc2sys[48384.050]: phc offset 328807550 s0 freq +0 delay 1505
ptp4l[48384.419]: linreg: points 4 slope 1.000008954 intercept 223 err 404
ptp4l[48384.419]: master offset -269 s2 freq -9177 path delay 55455
phc2sys[48385.050]: phc offset 328807545 s0 freq +0 delay 1515
ptp4l[48385.369]: port 1: delay timeout
ptp4l[48385.369]: path delay 55862 55986
ptp4l[48385.419]: linreg: points 4 slope 1.000009416 intercept 942 err 503
ptp4l[48385.419]: master offset -1199 s2 freq -10358 path delay 55862
ptp4l[48385.593]: port 1: delay timeout
ptp4l[48385.593]: path delay 55455 49211
ptp4l[48385.810]: port 1: delay timeout
ptp4l[48385.810]: path delay 55924 63105
phc2sys[48386.050]: phc offset 328806728 s0 freq +0 delay 1515
ptp4l[48386.419]: linreg: points 4 slope 1.000009022 intercept -976 err 608
ptp4l[48386.419]: master offset 1452 s2 freq -8046 path delay 55924
phc2sys[48387.050]: phc offset 328806932 s0 freq +0 delay 1517
ptp4l[48387.174]: port 1: delay timeout
ptp4l[48387.174]: path delay 55924 45407
ptp4l[48387.419]: linreg: points 4 slope 1.000008845 intercept -336 err 587
ptp4l[48387.419]: master offset 398 s2 freq -8509 path delay 55924
ptp4l[48387.670]: port 1: delay timeout
ptp4l[48387.671]: path delay 55924 60382
ptp4l[48387.683]: port 1: delay timeout
ptp4l[48387.684]: path delay 52608 49230
phc2sys[48388.050]: phc offset 328807681 s0 freq +0 delay 1516
ptp4l[48388.419]: linreg: points 4 slope 1.000007825 intercept -2085 err 629
ptp4l[48388.419]: master offset 2664 s2 freq -5740 path delay 52608
phc2sys[48389.051]: phc offset 328809985 s0 freq +0 delay 1536
ptp4l[48389.419]: linreg: points 4 slope 1.000007838 intercept 133 err 622
ptp4l[48389.419]: master offset -300 s2 freq -7971 path delay 52608
ptp4l[48389.423]: port 1: delay timeout
ptp4l[48389.424]: path delay 52608 47867
phc2sys[48390.051]: phc offset 328811953 s0 freq +0 delay 1518
ptp4l[48390.419]: linreg: points 4 slope 1.000008301 intercept 1323 err 653
ptp4l[48390.419]: master offset -2150 s2 freq -9623 path delay 52608
ptp4l[48390.894]: port 1: delay timeout
ptp4l[48390.894]: path delay 56592 57199
phc2sys[48391.051]: phc offset 328812074 s0 freq +0 delay 1515
ptp4l[48391.349]: port 1: delay timeout
ptp4l[48391.349]: path delay 52608 41147
ptp4l[48391.419]: linreg: points 4 slope 1.000008882 intercept 715 err 647
ptp4l[48391.419]: master offset -335 s2 freq -9597 path delay 52608
phc2sys[48392.051]: phc offset 328811597 s0 freq +0 delay 1517
ptp4l[48392.389]: port 1: delay timeout
ptp4l[48392.389]: path delay 49220 47224
ptp4l[48392.419]: linreg: points 4 slope 1.000007611 intercept -2922 err 716
ptp4l[48392.419]: master offset 4126 s2 freq -4689 path delay 49220
ptp4l[48392.568]: port 1: delay timeout
ptp4l[48392.568]: path delay 49220 53497
phc2sys[48393.051]: phc offset 328814195 s0 freq +0 delay 1516
ptp4l[48393.419]: linreg: points 4 slope 1.000007412 intercept 296 err 726
ptp4l[48393.419]: master offset -1236 s2 freq -7709 path delay 49220
phc2sys[48394.051]: phc offset 328816715 s0 freq +0 delay 1515
ptp4l[48394.419]: linreg: points 4 slope 1.000008064 intercept 1401 err 749
ptp4l[48394.419]: master offset -1872 s2 freq -9465 path delay 49220
ptp4l[48394.440]: port 1: delay timeout
ptp4l[48394.440]: path delay 50829 52428
phc2sys[48395.052]: phc offset 299887769 s0 freq +0 delay 1378
ptp4l[48395.362]: linreg: points 4 slope 1.000009908 intercept 3546 err 820
ptp4l[48395.362]: master offset -4256 s2 freq -13453 path delay 50829
ptp4l[48395.726]: port 1: delay timeout
ptp4l[48395.727]: path delay 50829 54273
phc2sys[48396.052]: phc offset 208967251 s0 freq +0 delay 1379
ptp4l[48396.279]: linreg: points 4 slope 1.000009460 intercept -1172 err 839
ptp4l[48396.279]: master offset 1810 s2 freq -8287 path delay 50829
ptp4l[48396.938]: port 1: delay timeout
ptp4l[48396.938]: path delay 52962 60679
phc2sys[48397.052]: phc offset 118050472 s0 freq +0 delay 1470
ptp4l[48397.195]: linreg: points 4 slope 1.000009995 intercept 1144 err 853
ptp4l[48397.195]: master offset -1520 s2 freq -11139 path delay 52962
phc2sys[48398.052]: phc offset 27131713 s0 freq +0 delay 1390
ptp4l[48398.112]: linreg: points 4 slope 1.000009589 intercept -618 err 846
ptp4l[48398.112]: master offset 531 s2 freq -8971 path delay 52962
ptp4l[48398.845]: port 1: delay timeout
ptp4l[48398.845]: path delay 52962 56980
phc2sys[48399.052]: phc offset -13300220 s0 freq +0 delay 1597
ptp4l[48399.077]: linreg: points 4 slope 1.000009968 intercept 375 err 830
ptp4l[48399.077]: master offset 9 s2 freq -10343 path delay 52962
ptp4l[48399.714]: port 1: delay timeout
ptp4l[48399.714]: path delay 53885 55412
ptp4l[48399.931]: port 1: delay timeout
ptp4l[48399.932]: path delay 54842 63268
phc2sys[48400.052]: phc offset -13301383 s0 freq +0 delay 1516
ptp4l[48400.077]: linreg: points 4 slope 1.000009308 intercept -1088 err 835
ptp4l[48400.077]: master offset 1070 s2 freq -8221 path delay 54842
ptp4l[48400.405]: port 1: delay timeout
ptp4l[48400.405]: path delay 54842 58364
phc2sys[48401.052]: phc offset -13300580 s0 freq +0 delay 1516
ptp4l[48401.077]: linreg: points 4 slope 1.000009393 intercept 145 err 821
ptp4l[48401.077]: master offset -150 s2 freq -9538 path delay 54842
phc2sys[48402.052]: phc offset -13301020 s0 freq +0 delay 1515
ptp4l[48402.077]: linreg: points 4 slope 1.000009213 intercept -398 err 815
ptp4l[48402.077]: master offset 546 s2 freq -8815 path delay 54842
ptp4l[48402.182]: port 1: delay timeout
ptp4l[48402.182]: path delay 54842 45838
ptp4l[48402.688]: port 1: delay timeout
ptp4l[48402.688]: path delay 54842 42398
phc2sys[48403.053]: phc offset -13300753 s0 freq +0 delay 1517
ptp4l[48403.077]: linreg: points 4 slope 1.000009176 intercept -40 err 799
ptp4l[48403.077]: master offset 6 s2 freq -9135 path delay 54842
ptp4l[48404.038]: port 1: delay timeout
ptp4l[48404.038]: path delay 54842 53071
phc2sys[48404.053]: phc offset -13300764 s0 freq +0 delay 1504
ptp4l[48404.078]: linreg: points 4 slope 1.000009149 intercept -11 err 784
ptp4l[48404.078]: master offset -39 s2 freq -9138 path delay 54842
phc2sys[48405.053]: phc offset -13300747 s0 freq +0 delay 1516
ptp4l[48405.078]: linreg: points 4 slope 1.000008982 intercept -468 err 783
ptp4l[48405.078]: master offset 753 s2 freq -8515 path delay 54842
ptp4l[48406.038]: port 1: delay timeout
ptp4l[48406.038]: path delay 56196 57368
phc2sys[48406.053]: phc offset -13300186 s0 freq +0 delay 1516
ptp4l[48406.078]: linreg: points 4 slope 1.000008225 intercept -1622 err 811
ptp4l[48406.078]: master offset 2160 s2 freq -6603 path delay 56196
phc2sys[48407.053]: phc offset -13297757 s0 freq +0 delay 1595
ptp4l[48407.078]: linreg: points 4 slope 1.000009064 intercept 2291 err 867
ptp4l[48407.078]: master offset -3629 s2 freq -11356 path delay 56196
ptp4l[48407.611]: port 1: delay timeout
ptp4l[48407.611]: path delay 56196 46776
phc2sys[48408.054]: phc offset -13299925 s0 freq +0 delay 1516
ptp4l[48408.078]: linreg: points 4 slope 1.000009548 intercept 671 err 859
ptp4l[48408.078]: master offset -469 s2 freq -10219 path delay 56196
phc2sys[48409.054]: phc offset -13301099 s0 freq +0 delay 1517
ptp4l[48409.078]: linreg: points 4 slope 1.000009940 intercept 417 err 843
ptp4l[48409.078]: master offset -63 s2 freq -10357 path delay 56196
ptp4l[48409.600]: port 1: delay timeout
ptp4l[48409.601]: path delay 54241 48536
phc2sys[48410.054]: phc offset -13318750 s0 freq +0 delay 1582
ptp4l[48410.078]: clockcheck: clock jumped forward or running faster than expected!
ptp4l[48410.078]: master offset 70368744180811 s0 freq -10357 path delay 54241
ptp4l[48410.078]: port 1: SLAVE to UNCALIBRATED on SYNCHRONIZATION_FAULT
phc2sys[48411.054]: port 002590.fffe.f355da-1 changed state
phc2sys[48411.054]: reconfiguring after port state change
phc2sys[48411.054]: master clock not ready, waiting...
ptp4l[48411.079]: master offset 70368744183134 s0 freq -10357 path delay 54241
ptp4l[48412.029]: port 1: delay timeout
ptp4l[48412.029]: path delay 51368 49666
ptp4l[48412.084]: master offset 70368744187846 s0 freq -10357 path delay 51368
ptp4l[48413.084]: linreg: points 4 slope 1.000007855 intercept -70368744188596 err 843
ptp4l[48413.084]: master offset 70368744187580 s2 freq +599999999 path delay 51368
ptp4l[48413.084]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
ptp4l[48413.596]: port 1: delay timeout
ptp4l[48413.596]: negative path delay -131938
ptp4l[48413.596]: path_delay = (t2 - t3) * rr + (t4 - t1) - (c1 + c2 + c3)
ptp4l[48413.596]: t2 - t3 = -205239411
ptp4l[48413.597]: t4 - t1 = +512838680
ptp4l[48413.597]: rr = 2.500019631
ptp4l[48413.597]: c1 0
ptp4l[48413.597]: c2 0
ptp4l[48413.597]: c3 0
ptp4l[48413.597]: path delay 49101 -131938
phc2sys[48414.055]: port 002590.fffe.f355da-1 changed state
phc2sys[48414.055]: reconfiguring after port state change
phc2sys[48414.055]: selecting CLOCK_REALTIME for synchronization
phc2sys[48414.055]: selecting eth0 as the master clock
phc2sys[48414.055]: phc offset -70368168772102 s0 freq +0 delay 1636
ptp4l[48414.084]: linreg: points 4 slope 0.999899345 intercept -70368144209416 err 362185
ptp4l[48414.084]: master offset 70368144318188 s2 freq +599999999 path delay 49101
ptp4l[48414.136]: port 1: delay timeout
ptp4l[48414.136]: path delay 47656 46209
ptp4l[48414.520]: port 1: delay timeout
ptp4l[48414.520]: path delay 47656 76564
phc2sys[48415.055]: phc offset -70367568670828 s0 freq +0 delay 1581
ptp4l[48415.084]: linreg: points 4 slope 0.999863205 intercept -70367544328976 err 181586
ptp4l[48415.084]: master offset 70367544293425 s2 freq +599999999 path delay 47656
ptp4l[48415.845]: port 1: delay timeout
ptp4l[48415.845]: path delay 49101 118776
phc2sys[48416.055]: phc offset -70366968524314 s0 freq +0 delay 1800
ptp4l[48416.084]: linreg: points 4 slope 0.999899195 intercept -70366944350012 err 181776
ptp4l[48416.084]: master offset 70366944276709 s2 freq +599999999 path delay 49101
phc2sys[48417.056]: phc offset -70366368366638 s0 freq +0 delay 1517
ptp4l[48417.084]: linreg: points 4 slope 1.000008752 intercept -70366344255069 err 182101
ptp4l[48417.084]: master offset 70366344254785 s2 freq +599999999 path delay 49101
ptp4l[48417.430]: port 1: delay timeout
ptp4l[48417.431]: path delay 51368 60901
phc2sys[48418.056]: phc offset -70365768242951 s0 freq +0 delay 1517
ptp4l[48418.084]: linreg: points 4 slope 1.000009541 intercept -70365744237176 err 145995
ptp4l[48418.084]: master offset 70365744237023 s2 freq +599999999 path delay 51368
ptp4l[48418.137]: port 1: delay timeout
ptp4l[48418.137]: path delay 49101 41680
ptp4l[48418.421]: port 1: delay timeout
ptp4l[48418.421]: path delay 48989 49442
phc2sys[48419.056]: phc offset -70365168123872 s0 freq +0 delay 1516
ptp4l[48419.084]: linreg: points 4 slope 1.000008598 intercept -70365144203149 err 122176
ptp4l[48419.084]: master offset 70365144204054 s2 freq +599999999 path delay 48989
ptp4l[48419.996]: port 1: delay timeout
ptp4l[48419.997]: path delay 48989 37993
phc2sys[48420.056]: phc offset -70364568028650 s0 freq +0 delay 1582
ptp4l[48420.084]: clockcheck: clock jumped forward or running faster than expected!
ptp4l[48420.084]: master offset 140733288370371 s0 freq +599999999 path delay 48989
ptp4l[48420.084]: port 1: SLAVE to UNCALIBRATED on SYNCHRONIZATION_FAULT
phc2sys[48421.056]: port 002590.fffe.f355da-1 changed state
phc2sys[48421.057]: reconfiguring after port state change
phc2sys[48421.057]: master clock not ready, waiting...
ptp4l[48421.084]: master offset 140732688350093 s0 freq +599999999 path delay 48989
ptp4l[48421.570]: port 1: delay timeout
ptp4l[48421.571]: path delay 49554 146073983
ptp4l[48422.084]: master offset 140732088332451 s0 freq +599999999 path delay 49554
ptp4l[48422.536]: port 1: delay timeout
ptp4l[48422.536]: path delay 55171 135723814
ptp4l[48423.084]: linreg: points 4 slope 1.000010546 intercept -140731488300783 err 122176
ptp4l[48423.084]: master offset 140731488299227 s2 freq +599999999 path delay 55171
ptp4l[48423.084]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
ptp4l[48423.093]: port 1: delay timeout
ptp4l[48423.093]: path delay 59812 58724
ptp4l[48423.502]: port 1: delay timeout
ptp4l[48423.502]: path delay 59812 43311
phc2sys[48424.057]: port 002590.fffe.f355da-1 changed state
phc2sys[48424.057]: reconfiguring after port state change
phc2sys[48424.057]: selecting CLOCK_REALTIME for synchronization
phc2sys[48424.057]: selecting eth0 as the master clock
phc2sys[48424.057]: phc offset -140730911647787 s0 freq +0 delay 1470
ptp4l[48424.084]: linreg: points 4 slope 1.000012236 intercept -140730888281503 err 3818
ptp4l[48424.084]: master offset 140730888280902 s2 freq +599999999 path delay 59812
ptp4l[48424.685]: port 1: delay timeout
ptp4l[48424.685]: path delay 57425 56126
phc2sys[48425.057]: phc offset -140730311546892 s0 freq +0 delay 1517
ptp4l[48425.084]: linreg: points 4 slope 1.000011115 intercept -140730288244723 err 4804
ptp4l[48425.084]: master offset 140730288247075 s2 freq +599999999 path delay 57425
phc2sys[48426.058]: phc offset -140729711417875 s0 freq +0 delay 1517
ptp4l[48426.084]: linreg: points 4 slope 1.000008793 intercept -140729688242615 err 4790
ptp4l[48426.084]: master offset 140729688243150 s2 freq +599999999 path delay 57425
ptp4l[48426.602]: port 1: delay timeout
ptp4l[48426.602]: path delay 56291 56456
ptp4l[48426.602]: port 1: delay timeout
ptp4l[48426.603]: path delay 56291 64187
phc2sys[48427.058]: phc offset -140729111582627 s0 freq +0 delay 1512
ptp4l[48427.083]: linreg: points 4 slope 1.000007751 intercept -140729088214989 err 3828
ptp4l[48427.083]: master offset 140729088214514 s2 freq +599999999 path delay 56291
ptp4l[48427.768]: port 1: delay timeout
ptp4l[48427.768]: path delay 57590 63474
phc2sys[48428.058]: phc offset -140728512483869 s0 freq +0 delay 1595
ptp4l[48428.081]: linreg: points 4 slope 1.000008704 intercept -140728488200748 err 3433
ptp4l[48428.081]: master offset 140728488200589 s2 freq +599999999 path delay 57590
phc2sys[48429.058]: phc offset -140727913371351 s0 freq +0 delay 1578
ptp4l[48429.079]: linreg: points 4 slope 1.000009008 intercept -140727888176495 err 3016
ptp4l[48429.079]: master offset 140727888176243 s2 freq +599999999 path delay 57590
^Cphc2sys[48429.590]: phc offset -140727594663556 s0 freq +0 delay 1577

Clearly bonkers now, so ^C.
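
One observation on the numbers, offered as a sketch rather than a diagnosis: the first master offset ptp4l reports after each clockcheck message is within a few microseconds of 2^46 ns (about 70368.7 s). The Python below just does that arithmetic on two offsets copied from the logs in this thread. Whether this points to a counter wrap in the driver or the hardware is not established here, and the helper name is made up for illustration.

```python
NS_PER_S = 1_000_000_000
TWO_POW_46 = 2 ** 46  # 70368744177664 ns


def distance_from_2_pow_46(offset_ns):
    """Return how far an offset (in ns) is from exactly 2**46 ns."""
    return offset_ns - TWO_POW_46


# First "master offset" value logged right after each clockcheck warning.
FIRST_OFFSETS_AFTER_JUMP = {
    "ptp4l[365.571]": 70368744176888,
    "ptp4l[48410.078]": 70368744180811,
}

for tag, offset in FIRST_OFFSETS_AFTER_JUMP.items():
    seconds = offset / NS_PER_S
    delta = distance_from_2_pow_46(offset)
    print(f"{tag}: {seconds:.3f} s, {delta:+d} ns from 2**46")
```

Since the master offset was only tens of nanoseconds just before the second of these jumps, the offset after the jump is a fair estimate of the jump size itself.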

Then I check dmesg, nothing there at all.

Any way to turn on some e1000e debugging?
Post by Gary E. Miller
You must not tell phc2sys to drive the system clock, then;
otherwise those two programs would fight each other. This means
you must not pass "-r" to phc2sys, since that option tells phc2sys
to drive the system clock (please do read the phc2sys man page
before asking more questions about this, thanks).
I do not believe Jiri is right. I ran a similar config and it
appeared to work fine, without crazy clock jumping. Chronyd simply
took the SHM reference and tuned the system clock over time, because
the ntpshm servo presents itself to the ntp daemon.
I hope so. I'll believe it when I see it.
Post by Gary E. Miller
I tried removing the "-r". When I do that the ntpshm is no longer
written, thus cutting ntpshm out of the loop, making this useless.
You need the "-r" because otherwise it doesn't have a clock to
synchronize.
As you can see, I re-added the "-r" option above.
Post by Gary E. Miller
Well, that conflicts with prior advice on this list. Are you
saying that I have no need to run phc2sys in hardware timestamp
mode? Does that mean I can run just ptp4l in ntpshm mode and
timestamp mode?
You must run phc2sys in hardware timestamping mode.
OK, I think that is what I'm doing. You can see above what I am doing.
Post by Gary E. Miller
I have also seen ptp4l software time suddenly gain large (300 mSec)
offsets that persist for hours. A program restart fixes that.
There is a lot of promise to this package, but a lot of work to do
as well.
If the remote clock jumps by 300ms, that is below the normal "jump"
threshold, so the ptp4l servo tries to tune for this by reducing the
frequency. That will take a very long time to catch up, since most
devices cannot adjust the frequency very far. This is in order to
provide a smooth transition instead of an immediate jump.
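
As a rough illustration of why slewing takes so long: 1 ppb of frequency adjustment removes 1 ns of offset per second, so the best-case time to remove an offset is the offset divided by the maximum adjustment. The limits used below (500000 ppb and 10000 ppb) are assumptions for the example, not values taken from this thread, and the function name is hypothetical. A real servo will not sit at its limit the whole time, so actual convergence is slower.

```python
def slew_seconds(offset_ns, max_adj_ppb):
    """Best-case seconds to remove offset_ns when slewing at max_adj_ppb.

    1 ppb corresponds to 1 ns of correction per elapsed second.
    """
    if max_adj_ppb <= 0:
        raise ValueError("max_adj_ppb must be positive")
    return abs(offset_ns) / max_adj_ppb


OFFSET_NS = 300_000_000  # the 300 ms offset discussed above

# Assumed adjustment limits, for illustration only.
for limit_ppb in (500_000, 10_000):
    secs = slew_seconds(OFFSET_NS, limit_ppb)
    print(f"limit {limit_ppb} ppb: about {secs:.0f} s ({secs / 3600:.2f} h)")
```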
My remote master clock has local PPS linked using chronyd, jitter around
200 nSec or better. Sometimes I run my master with two local PPS for a
sanity check. My master chimer has a number of slaves also with local
PPS.

So it is not possible that the master is anything but near perfect,
perfect being defined as better than 1 uSec.

But we can leave the 300 mSec problem aside for now. chronyd can handle
that. It is the hardware clock going totally bonkers that is killing
me.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97701
***@rellim.com Tel:+1(541)382-8588
Keller, Jacob E
2015-02-24 23:34:10 UTC
-----Original Message-----
Sent: Tuesday, February 24, 2015 3:06 PM
To: Keller, Jacob E
Subject: Re: [Linuxptp-devel] ntp SHMs
Yo Jacob E!
On Tue, 24 Feb 2015 22:35:09 +0000
Hopefully you can provide the ptp4l results alone without phc2sys or
the chronyd interfering.
Sent, I think.
Like I mentioned use the testptp program
from the kernel Documentation to sanity check the device's clock.
If I can figure out how to make it.
Post by Gary E. Miller
kong ~ # ptp4l -i eth0 -l 7 -m -f ptp.conf &
I recommend that you run with -l 6, since -l 7 prints a ton of debug
information that is nearly always unhelpful and clutters the time log
output.
Changed.
Post by Gary E. Miller
delay 54241 48536 phc2sys[48410.054]: phc offset
clockcheck: clock jumped forward or running faster than expected!
And at this point, the clock jumped. *something* twiddled with the
clock.
Ptp4l only cares about the hardware MAC clock :)
Which clock? The system clock was stable at that time.
The MAC clock that ptp4l was synchronizing jumped.
As for the "what", I am not sure. Also, it seems weird that the phc
switched from a positive to a negative offset.
Yup, weird.
I assume you put the
NTP SHM into chronyd as a refclock?
Yup. That way I can compare it to other clocks. I have a "watch chronyc
sources" going, and I can see the phc2sys clock go nuts; then it takes a while
before chronyd takes notice. Usually chronyd marks it a falsechimer,
but occasionally follows it to crazy town for a while.
Post by Gary E. Miller
ptp4l[48410.078]: master offset 70368744180811 s0 freq -10357 path delay 54241
ptp4l[48410.078]: port 1: SLAVE to UNCALIBRATED on
SYNCHRONIZATION_FAULT
At this point ptp4l has reset and now things get funky. I suspect
this is either a driver bug, or somehow you have other things
controlling the clock. Obviously both of us are stumped on what else
could be doing it...
Again, most of the time I assumed it would be clear which clock I was referencing due to what portion of the process I was talking about. In this case, the MAC clock has gone to crazy town.
By 'the clock' you mean which clock? The system clock is not moving.
So you mean the phc clock?
Post by Gary E. Miller
Clearly bonkers now, so ^C.
Yes, it's bonkers now because of whatever that invalid clock jump event
was, which very much screwed up all the settings.
Well, at least two of us see it.
Post by Gary E. Miller
OK, I think that is what I'm doing. You can see above what I am doing.
Just to clarify, without phc2sys then you only synchronize the MAC
hardware clock and not the real system clock.
Right, and confusing. In timestamp software mode ptp4l steers the NTP
SHM (and then possibly, by way of chronyd, the system clock), but in
timestamp hardware mode phc2sys steers the NTP SHM.
RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97701
Jiri Benc
2015-02-25 14:53:26 UTC
Permalink
I do not believe Jiri is right. I ran a similar config and it appeared
to work fine, without crazy clock jumping. Chronyd simply took the SHM
reference and tuned the system clock over time, because the ntpshm
servo presents itself to the ntp daemon.
You're right and I'm not. The ntpshm servo always sets the
SERVO_UNLOCKED state in the sample() callback, thus it never sets any
clock. I didn't know that and I dislike that very much. This is a gross
hack. Not to mention it's not documented in the man page.

Miroslav, any chance of improving this so that it is more understandable to
users? From the user point of view, the shm is just another time
source. In the code, it could be implemented as a fake clock (as you
need at least two clocks for phc2sys to do anything) driven by this
special servo. Requiring the user to add a random second clock for this
to work (be it a system clock or a different PHC) is very confusing.

This still won't allow things like a two-PHC boundary clock with NTP
synchronization. For this, we'll need to be able to specify per-clock
servos. The ntpshm servo then will drive only the fake clock.

Jiri
--
Jiri Benc
Miroslav Lichvar
2015-02-25 15:15:31 UTC
Permalink
Post by Jiri Benc
I do not believe Jiri is right. I ran a similar config and it appeared
to work fine, without crazy clock jumping. Chronyd simply took the SHM
reference and tuned the system clock over time, because the ntpshm
servo presents itself to the ntp daemon.
You're right and I'm not. The ntpshm servo always sets the
SERVO_UNLOCKED state in the sample() callback, thus it never sets any
clock. I didn't know that and I dislike that very much. This is a gross
hack. Not to mention it's not documented in the man page.
The man page says the clock is synchronized by another process.
Post by Jiri Benc
Miroslav, any chance of improving this so that it is more understandable to
users?
Possibly, any suggestions?
Post by Jiri Benc
From the user point of view, the shm is just another time
source. In the code, it could be implemented as a fake clock (as you
need at least two clocks for phc2sys to do anything) driven by this
special servo.
I don't follow. What would the fake clock do?
Post by Jiri Benc
Requiring the user to add a random second clock for this
to work (be it a system clock or a different PHC) is very confusing.
It's not a random clock, it's the clock that is synchronized by the
SHM consumer, i.e. the system clock with chronyd/ntpd. SHM samples
contain timestamps from both clocks (PHC and system clock).
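
A toy model of the behaviour described above may help. It is only a sketch and not linuxptp code: the names ShmSample and NtpShmServoModel are invented for illustration, and the sample layout is not the real SHM layout. The only things taken from this thread are that the servo always reports SERVO_UNLOCKED from sample(), never adjusts a clock itself, and publishes samples that carry a timestamp from each clock.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

SERVO_UNLOCKED = "SERVO_UNLOCKED"  # state name as quoted in the thread


@dataclass
class ShmSample:
    """Hypothetical stand-in for one SHM sample (not the real layout)."""
    phc_ns: int      # timestamp taken from the PHC (the reference clock)
    system_ns: int   # timestamp taken from the system clock

    def offset_ns(self) -> int:
        # What the SHM consumer (chronyd/ntpd) can derive from the pair.
        return self.phc_ns - self.system_ns


@dataclass
class NtpShmServoModel:
    """Toy servo: publishes samples and never steers a clock."""
    published: List[ShmSample] = field(default_factory=list)

    def sample(self, phc_ns: int, system_ns: int) -> Tuple[float, str]:
        self.published.append(ShmSample(phc_ns, system_ns))
        # No frequency adjustment and always "unlocked", so the caller
        # never touches the clock; the SHM consumer does that instead.
        return 0.0, SERVO_UNLOCKED


servo = NtpShmServoModel()
adj, state = servo.sample(phc_ns=1_000_000_060_000, system_ns=1_000_000_000_000)
print(state, adj, servo.published[0].offset_ns())
```

In this model the second clock handed to phc2sys is simply the clock whose timestamp ends up in system_ns, which is why it is not a random clock.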
--
Miroslav Lichvar
Jiri Benc
2015-02-25 15:32:00 UTC
Permalink
[Dropping Gary and Jacob from CC, this is not related to the original
topic anymore.]
Post by Miroslav Lichvar
The man page says the clock is synchronized by another process.
It's not clear what that means, though. I think the behavior of ntpshm
is so special that it deserves a special chapter in the man page
describing how it works, how to use it and that specifying clock
sources (including the system clock and the -r parameter) has different
meaning with this servo. I mean, even I am very confused by this thing,
what about poor users ;-)
Post by Miroslav Lichvar
Post by Jiri Benc
From the user point of view, the shm is just another time
source. In the code, it could be implemented as a fake clock (as you
need at least two clocks for phc2sys to do anything) driven by this
special servo.
I don't follow. What would the fake clock do?
Post by Jiri Benc
Requiring the user to add a random second clock for this
to work (be it a system clock or a different PHC) is very confusing.
It's not a random clock, it's the clock that is synchronized by the
SHM consumer, i.e. the system clock with chronyd/ntpd. SHM samples
contain timestamps from both clocks (PHC and system clock).
Hmm, I see. The fake clock would not help, then. I still don't like it
but I don't see a better way to do this, currently. We definitely need
better documentation.

Thanks for the explanation,

Jiri
--
Jiri Benc
Keller, Jacob E
2015-02-24 21:41:52 UTC
Permalink
Hey,
-----Original Message-----
Sent: Monday, February 23, 2015 6:49 PM
To: Keller, Jacob E
Subject: Re: [Linuxptp-devel] ntp SHMs
Yo Jacob!
Just to summarize what I just tried, that failed. I repeated several times,
similar results, it just took varying times before going crazy, usually
10 to 90 seconds.
kong ~ # ethtool -T eth0
hardware-transmit (SOF_TIMESTAMPING_TX_HARDWARE)
software-transmit (SOF_TIMESTAMPING_TX_SOFTWARE)
hardware-receive (SOF_TIMESTAMPING_RX_HARDWARE)
software-receive (SOF_TIMESTAMPING_RX_SOFTWARE)
software-system-clock (SOF_TIMESTAMPING_SOFTWARE)
hardware-raw-clock (SOF_TIMESTAMPING_RAW_HARDWARE)
PTP Hardware Clock: 0
off (HWTSTAMP_TX_OFF)
on (HWTSTAMP_TX_ON)
none (HWTSTAMP_FILTER_NONE)
all (HWTSTAMP_FILTER_ALL)
ptpv1-l4-sync (HWTSTAMP_FILTER_PTP_V1_L4_SYNC)
ptpv1-l4-delay-req (HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ)
ptpv2-l4-sync (HWTSTAMP_FILTER_PTP_V2_L4_SYNC)
ptpv2-l4-delay-req (HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ)
ptpv2-l2-sync (HWTSTAMP_FILTER_PTP_V2_L2_SYNC)
ptpv2-l2-delay-req (HWTSTAMP_FILTER_PTP_V2_L2_DELAY_REQ)
ptpv2-event (HWTSTAMP_FILTER_PTP_V2_EVENT)
ptpv2-sync (HWTSTAMP_FILTER_PTP_V2_SYNC)
ptpv2-delay-req (HWTSTAMP_FILTER_PTP_V2_DELAY_REQ)
kong ~ # killall ptp4l phc2sys
ptp4l: no process found
phc2sys: no process found
kong ~ # killall ptp4l phc2sys
ptp4l: no process found
phc2sys: no process found
kong ~ # cat ptp.conf
[global]
clock_servo linreg
kong ~ # ptp4l -i eth0 -l 7 -m -f ptp.conf &
kong ~ # phc2sys -a -r -E ntpshm -m -M 2
phc2sys[354.145]: uds: sendto failed: No such file or directory
Apparently it recovers, because you seem to have it working. The default should be fine. I am going to assume this is a transient problem due to timing between when you start ptp4l and phc2sys, where the uds address isn't up yet.
This one is odd, is uds_address not defaulted as documented?
Sadly, adding uds_address /var/run/ptp4l to my ptp.conf does not change
anything.
ptp4l[354.146]: selected /dev/ptp0 as PTP clock
ptp4l[354.183]: port 1: INITIALIZING to LISTENING on INITIALIZE
ptp4l[354.183]: port 0: INITIALIZING to LISTENING on INITIALIZE
ptp4l[354.570]: port 1: setting asCapable
phc2sys[355.146]: Waiting for ptp4l...
ptp4l[355.197]: port 0: setting asCapable
ptp4l[355.674]: port 1: new foreign master 003048.fffe.345fe2-1
phc2sys[356.198]: reconfiguring after port state change
phc2sys[356.198]: selecting eth0 for synchronization
phc2sys[356.198]: nothing to synchronize
ptp4l[359.341]: selected best master clock 003048.fffe.345fe2
ptp4l[359.341]: foreign master not using PTP timescale
ptp4l[359.341]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
ptp4l[359.361]: port 1: delay timeout
ptp4l[359.526]: port 1: delay timeout
phc2sys[360.198]: port 002590.fffe.f355da-1 changed state
phc2sys[360.198]: reconfiguring after port state change
phc2sys[360.198]: master clock not ready, waiting...
ptp4l[360.352]: port 1: delay timeout
ptp4l[360.353]: path delay 58263 58263
ptp4l[360.987]: master offset 16506354212 s0 freq -0 path delay 58263
ptp4l[361.746]: port 1: delay timeout
ptp4l[361.746]: path delay 59135 60008
ptp4l[361.904]: master offset 16506344102 s0 freq -0 path delay 59135
ptp4l[362.820]: master offset 16506334623 s0 freq -0 path delay 59135
ptp4l[363.681]: port 1: delay timeout
ptp4l[363.681]: path delay 58263 47033
ptp4l[363.737]: linreg: points 4 slope 1.000008825 intercept -16506326986 err 0
ptp4l[363.737]: master offset 16506327955 s1 freq -9794 path delay 58263
ptp4l[364.654]: linreg: points 4 slope 1.000008873 intercept 613 err 1412
ptp4l[364.654]: master offset -1412 s2 freq -9485 path delay 58263
ptp4l[364.654]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
phc2sys[365.199]: port 002590.fffe.f355da-1 changed state
phc2sys[365.199]: reconfiguring after port state change
phc2sys[365.199]: selecting CLOCK_REALTIME for synchronization
phc2sys[365.199]: selecting eth0 as the master clock
phc2sys[365.199]: phc offset -70353239245525 s0 freq +0 delay 1348
WTF was that???
This is telling you that your clock time (CLOCK_REALTIME) is off by 70,000 seconds. That's not happy.
ptp4l[365.571]: clockcheck: clock jumped forward or running faster than expected!
ptp4l[365.571]: master offset 70368744176888 s0 freq -9485 path delay 58263
ptp4l[365.571]: port 1: SLAVE to UNCALIBRATED on SYNCHRONIZATION_FAULT
ptp4l[366.164]: port 1: delay timeout
ptp4l[366.164]: path delay 58096 57929
phc2sys[366.199]: port 002590.fffe.f355da-1 changed state
phc2sys[366.199]: reconfiguring after port state change
phc2sys[366.199]: master clock not ready, waiting...
ptp4l[366.487]: master offset 70368744178475 s0 freq -9485 path delay 58096
ptp4l[367.404]: master offset 70368744179257 s0 freq -9485 path delay 58096
ptp4l[367.495]: port 1: delay timeout
ptp4l[367.495]: path delay 58263 61062
ptp4l[368.321]: linreg: points 4 slope 1.000008584 intercept -70368744179915 err 1412
ptp4l[368.321]: master offset 70368744179632 s2 freq +599999999 path delay 58263
ptp4l[368.321]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
ptp4l[368.810]: port 1: delay timeout
ptp4l[368.810]: negative path delay -69667
ptp4l[368.810]: path_delay = (t2 - t3) * rr + (t4 - t1) - (c1 + c2 + c3)
ptp4l[368.810]: t2 - t3 = -213724133
ptp4l[368.810]: t4 - t1 = +534175583
ptp4l[368.810]: rr = 2.500021454
ptp4l[368.810]: c1 0
ptp4l[368.810]: c2 0
ptp4l[368.810]: c3 0
ptp4l[368.810]: path delay 58096 -69667
ptp4l[369.140]: port 1: delay timeout
ptp4l[369.140]: negative path delay -60229
ptp4l[369.140]: path_delay = (t2 - t3) * rr + (t4 - t1) - (c1 + c2 + c3)
ptp4l[369.140]: t2 - t3 = -357590119
ptp4l[369.140]: t4 - t1 = +893862511
ptp4l[369.140]: rr = 2.500021454
ptp4l[369.140]: c1 0
ptp4l[369.140]: c2 0
ptp4l[369.140]: c3 0
ptp4l[369.140]: path delay 57929 -60229
phc2sys[369.199]: port 002590.fffe.f355da-1 changed state
phc2sys[369.199]: reconfiguring after port state change
phc2sys[369.199]: selecting CLOCK_REALTIME for synchronization
phc2sys[369.199]: selecting eth0 as the master clock
phc2sys[369.199]: phc offset -70353027879345 s0 freq +0 delay 1391
ptp4l[369.237]: linreg: points 4 slope 0.999941194 intercept -70368144182166 err 225182
ptp4l[369.237]: master offset 70368144249873 s2 freq +599999999 path delay 57929
ptp4l[370.073]: port 1: delay timeout
ptp4l[370.073]: path delay 58096 71844
ptp4l[370.154]: linreg: points 4 slope 0.999918331 intercept -70367544242678 err 113137
ptp4l[370.154]: master offset 70367544220469 s2 freq +599999999 path delay 58096
phc2sys[370.199]: phc offset -70352464198801 s0 freq +0 delay 1345
ptp4l[371.071]: linreg: points 4 slope 0.999940831 intercept -70366944235752 err 113128
ptp4l[371.071]: master offset 70366944190386 s2 freq +599999999 path delay 58096
phc2sys[371.199]: phc offset -70351900518690 s0 freq +0 delay 1379
ptp4l[371.706]: port 1: delay timeout
ptp4l[371.706]: path delay 58263 80317
ptp4l[371.987]: linreg: points 4 slope 1.000008261 intercept -70366344169510 err 112836
ptp4l[371.987]: master offset 70366344169768 s2 freq +599999999 path delay 58263
phc2sys[372.200]: phc offset -70351336831824 s0 freq +0 delay 1347
ptp4l[372.904]: linreg: points 4 slope 1.000008717 intercept -70365744150943 err 90551
ptp4l[372.904]: master offset 70365744150552 s2 freq +599999999 path delay 58263
phc2sys[373.200]: phc offset -70350773141966 s0 freq +0 delay 1388
ptp4l[373.617]: port 1: delay timeout
ptp4l[373.617]: path delay 58103 57944
ptp4l[373.821]: linreg: points 4 slope 1.000008674 intercept -70365144126319 err 75472
ptp4l[373.821]: master offset 70365144126321 s2 freq +599999999 path delay 58103
phc2sys[374.200]: phc offset -70350209474225 s0 freq +0 delay 1348
ptp4l[374.738]: linreg: points 4 slope 1.000008812 intercept -70364544099292 err 64723
ptp4l[374.738]: master offset 70364544099571 s2 freq +599999999 path delay 58103
ptp4l[374.831]: port 1: delay timeout
ptp4l[374.831]: path delay 58976 62467
phc2sys[375.200]: phc offset -70349645810969 s0 freq +0 delay 1347
ptp4l[375.654]: linreg: points 4 slope 1.000008650 intercept -70363944062902 err 56663
ptp4l[375.654]: master offset 70363944062598 s2 freq +599999999 path delay 58976
ptp4l[375.923]: port 1: delay timeout
ptp4l[375.923]: path delay 57936 53829
phc2sys[376.200]: phc offset -70349082116188 s0 freq +0 delay 1460
ptp4l[376.571]: linreg: points 4 slope 1.000008152 intercept -70363344039615 err 50595
ptp4l[376.571]: master offset 70363344040348 s2 freq +599999999 path delay 57936
phc2sys[377.200]: phc offset -70348518427659 s0 freq +0 delay 1343
ptp4l[377.488]: linreg: points 4 slope 1.000007828 intercept -70362744024915 err 45586
ptp4l[377.488]: master offset 70362744024897 s2 freq +599999999 path delay 57936
ptp4l[377.807]: port 1: delay timeout
ptp4l[377.807]: path delay 59445 60946
phc2sys[378.200]: phc offset -70347954751340 s0 freq +0 delay 1353
ptp4l[378.404]: linreg: points 4 slope 1.000008160 intercept -70362143984809 err 44713
ptp4l[378.404]: master offset 70362143983983 s2 freq +599999999 path delay 59445
phc2sys[379.200]: phc offset -70347391089799 s0 freq +0 delay 1388
ptp4l[379.321]: linreg: points 4 slope 1.000009185 intercept -70361543971115 err 43857
ptp4l[379.321]: master offset 70361543970987 s2 freq +599999999 path delay 59445
ptp4l[379.707]: port 1: delay timeout
ptp4l[379.707]: path delay 59445 42090
phc2sys[380.200]: phc offset -70346827428315 s0 freq +0 delay 1462
ptp4l[380.238]: linreg: points 4 slope 1.000009091 intercept -70360943944337 err 42997
ptp4l[380.238]: master offset 70360943944743 s2 freq +599999999 path delay 59445
ptp4l[380.465]: port 1: delay timeout
ptp4l[380.465]: path delay 55886 46490
ptp4l[381.155]: linreg: points 8 slope 1.000008246 intercept -70360343906082 err 33333
ptp4l[381.155]: master offset 70360343908136 s2 freq +599999999 path delay 55886
phc2sys[381.200]: phc offset -70346263731002 s0 freq +0 delay 1463
ptp4l[381.758]: port 1: delay timeout
ptp4l[381.759]: path delay 55886 45742
ptp4l[382.071]: linreg: points 8 slope 1.000008035 intercept -70359743870239 err 32710
ptp4l[382.071]: master offset 70359743871471 s2 freq +599999999 path delay 55886
phc2sys[382.200]: phc offset -70345700043849 s0 freq +0 delay 1464
ptp4l[382.451]: port 1: delay timeout
ptp4l[382.452]: path delay 56747 55551
ptp4l[382.988]: linreg: points 8 slope 1.000008136 intercept -70359143861734 err 32071
ptp4l[382.988]: master offset 70359143861383 s2 freq +599999999 path delay 56747
phc2sys[383.200]: phc offset -70345136355504 s0 freq +0 delay 1461
ptp4l[383.905]: linreg: points 8 slope 1.000008152 intercept -70358543843295 err 31462
ptp4l[383.905]: master offset 70358543842094 s2 freq +599999999 path delay 56747
ptp4l[384.188]: port 1: delay timeout
ptp4l[384.188]: path delay 54690 39523
phc2sys[384.200]: phc offset -70344572667058 s0 freq +0 delay 1350
ptp4l[384.313]: port 1: delay timeout
ptp4l[384.314]: path delay 53598 53368
ptp4l[384.821]: linreg: points 8 slope 1.000007880 intercept -70357943816603 err 30869
ptp4l[384.821]: master offset 70357943817385 s2 freq +599999999 path delay 53598
ptp4l[385.156]: port 1: delay timeout
ptp4l[385.156]: path delay 53598 57616
phc2sys[385.200]: phc offset -70344008980381 s0 freq +0 delay 1464
ptp4l[385.738]: linreg: points 8 slope 1.000007783 intercept -70357343791409 err 30264
ptp4l[385.738]: master offset 70357343791685 s2 freq +599999999 path delay 53598
phc2sys[386.201]: phc offset -70343445298408 s0 freq +0 delay 1466
ptp4l[386.655]: linreg: points 8 slope 1.000007932 intercept -70356743766393 err 29678
ptp4l[386.655]: master offset 70356743766013 s2 freq +599999999 path delay 53598
ptp4l[386.967]: port 1: delay timeout
ptp4l[386.968]: path delay 51744 50120
phc2sys[387.201]: phc offset -70342881625905 s0 freq +0 delay 1352
ptp4l[387.571]: linreg: points 8 slope 1.000007910 intercept -70356143740286 err 29130
ptp4l[387.571]: master offset 70356143742033 s2 freq +599999999 path delay 51744
I stopped it here as it tried to step my good clock by -70356s.
So this looks quite suspicious as a driver bug, especially as it appears to be toggling between a large positive and a large negative value. I recall seeing something similar.
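
The two "negative path delay" dumps in the log above can be checked by hand. The sketch below evaluates the expression ptp4l prints, using the logged t2 - t3, t4 - t1 and rr values. One caveat: the printed expression shows no division by two, yet only half of it reproduces the logged -69667 and -60229. The halving is therefore an inference from the arithmetic, not something stated in this thread, and the function name is made up for illustration.

```python
def path_delay_ns(t2_minus_t3, t4_minus_t1, rr, corrections=(0, 0, 0)):
    """Evaluate the expression printed in the log, then halve it.

    The halving is an assumption that happens to match the logged values.
    """
    total = t2_minus_t3 * rr + t4_minus_t1 - sum(corrections)
    return total / 2


# Values copied from the two dumps at 368.810 and 369.140.
first = path_delay_ns(-213724133, 534175583, 2.500021454)
second = path_delay_ns(-357590119, 893862511, 2.500021454)
print(round(first), round(second))  # -69667 -60229
```

With a rate ratio of about 2.5 instead of something very close to 1.0, the first term swamps the second, which is how the result goes negative.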

What happens if you just run ptp4l without running phc2sys and without running chronyd etc? I want to see ptp4l running stable with hardware timestamping before adding extra messages. Also what is your dmesg output here, so that we can see if there is any kernel message related to this.

Regards,
Jake
RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97701
Gary E. Miller
2015-02-24 22:24:48 UTC
Permalink
Yo Jacob E!

On Tue, 24 Feb 2015 21:41:52 +0000
Post by Keller, Jacob E
Post by Gary E. Miller
kong ~ # phc2sys -a -r -E ntpshm -m -M 2
phc2sys[354.145]: uds: sendto failed: No such file or directory
Apparently it recovers, because you seem to have it working. The
default should be fine. I am going to assume this is a transient
problem due to timing between when you start ptp4l and phc2sys, where
the uds address isn't up yet.
Confirmed. Adding a 'sleep 3' to my test script fixes it. I would
suggest the error message make some mention of what it was trying
to send where, for debug purposes.
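
Instead of a fixed 'sleep 3', a test script could poll for the socket path before starting phc2sys. This is a minimal sketch: wait_for_path is a made-up helper name, /var/run/ptp4l is the uds_address used elsewhere in this thread, and the existence of the socket file is assumed to be a good enough sign that ptp4l is listening.

```python
import os
import time


def wait_for_path(path, timeout_s=10.0, poll_s=0.1):
    """Return True once `path` exists, False if `timeout_s` elapses first."""
    deadline = time.monotonic() + timeout_s
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


# Possible use in a test script, after launching ptp4l in the background:
#   if wait_for_path("/var/run/ptp4l"):
#       subprocess.run(["phc2sys", "-a", "-r", "-E", "ntpshm", "-m", "-M", "2"])
```

This only removes the startup race; it does nothing about the clock jump itself.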
Post by Keller, Jacob E
Post by Gary E. Miller
phc2sys[365.199]: phc offset -70353239245525 s0 freq +0 delay 1348
WTF was that???
This is telling you that your clock time (CLOCK_REALTIME) is off by
70,000 seconds. That's not happy.
Except my clock time is off by about 60 uSec. At least until phc2sys
tries to 'fix' the problem.
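
A side note on the size of these numbers. The ptp4l master offset values logged right after the jump (70368744176888 and 70368744180811 ns) both sit within a few microseconds of 2**46 ns. The sketch below only does that arithmetic; reading it as a hint of a counter wrap or a stuck bit somewhere is a guess, not something established in this thread.

```python
def nearest_power_of_two(n):
    """Return (exponent, n - 2**exponent) for the power of two closest to n > 0."""
    low = n.bit_length() - 1   # largest exponent with 2**low <= n
    high = low + 1
    exp = low if n - (1 << low) <= (1 << high) - n else high
    return exp, n - (1 << exp)


# Master offsets copied from the ptp4l log lines in this thread.
for offset_ns in (70368744176888, 70368744180811):
    exp, residue = nearest_power_of_two(offset_ns)
    print(f"{offset_ns} ns = {offset_ns / 1e9:.1f} s = 2**{exp} {residue:+d} ns")
```

Both lines come out as 2**46 with a residue of -776 ns and +3147 ns respectively, i.e. about the size of a normal, healthy offset.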
Post by Keller, Jacob E
Post by Gary E. Miller
I stopped it here as it tried to step my good clock by -70356s.
So this looks quite suspicious as a driver bug. Especially as it
appears to be toggling between a large positive and negative value I
am recalling something similar.
Ouch. I am on an Intel I217-LM with the e1000e driver.

Maybe time for me to try another NIC.
Post by Keller, Jacob E
What happens if you just run ptp4l without running phc2sys and
without running chronyd etc? I want to see ptp4l running stable with
hardware timestamping before adding extra messages.
Got a suggested config and procedure for that?

Since in the present config my ptp4l is in linreg mode, is it just a
matter of starting ptp4l in linreg mode, not running phc2sys, and
looking at the ptp4l debug output?
Post by Keller, Jacob E
Also what is your
dmesg output here, so that we can see if there is any kernel message
related to this.
Nothing at all in dmesg.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97701
***@rellim.com Tel:+1(541)382-8588
Keller, Jacob E
2015-02-24 23:31:37 UTC
Permalink
-----Original Message-----
Sent: Tuesday, February 24, 2015 2:55 PM
To: Keller, Jacob E
Subject: Re: [Linuxptp-devel] ntp SHMs
Yo Jacob E!
On Tue, 24 Feb 2015 22:32:40 +0000
Post by Gary E. Miller
Post by Keller, Jacob E
This is telling you that your clock time (CLOCK_REALTIME) is off
by 70,000 seconds. That's not happy.
Except my clock time is off by about 60 uSec. At least until
phc2sys tries to 'fix' the problem.
No. That is a mis-understanding. Your internal NIC clock is off by 60
uSec.
No misunderstanding. My slave is NTP connected, so I KNOW the
system clock is off by 60 uSec from the clock master.
Your REALTIME clock is not the NIC clock, but actually the
kernel time,
Never thought otherwise.
and *that* is what phc2sys offset is for.
Hmm, I see the confusion, I never mentioned my phc2sys offset. Until
it goes to 70,000 seconds....
Correct. I was not assuming that you had another way to measure system time offset from the master.
Post by Gary E. Miller
Post by Keller, Jacob E
So this looks quite suspicious as a driver bug. Especially as it
appears to be toggling between a large positive and negative
value I am recalling something similar.
Ouch. I am on an Intel I217-LM with the e1000e driver.
Ya. I looked at the driver for this, but nothing was obvious. The
issue I thought might be related appears to be fine in the 3.19
kernel.
Another dead end...
Post by Gary E. Miller
Got a suggested config and procedure for that?
Please just run your setup without running phc2sys, and use -l6 on
ptp4l instead of -l7.
[global]
clock_servo linreg
uds_address /var/run/ptp4l
# killall ptp4l phc2sys
# killall ptp4l phc2sys
# ptp4l -i eth0 -l 6 -m -f ptp.conf &
See if you get the same clock jump issue or not.
Since my ptp4l is in linreg mode, it can not jump my system clock.
To be clear, linreg is just a linear regression servo which could control the system clock, if you were doing software timestamping. In this case linreg is controlling the hardware MAC clock.
So what should I look for?
This doesn't look ugly, you just likely mis-understand the output.
# killall ptp4l phc2sys
ptp4l: no process found
phc2sys: no process found
# ptp4l -i eth0 -l 6 -m -f ptp.conf
ptp4l[52360.386]: selected /dev/ptp0 as PTP clock
ptp4l[52360.386]: port 1: INITIALIZING to LISTENING on INITIALIZE
ptp4l[52360.386]: port 0: INITIALIZING to LISTENING on INITIALIZE
ptp4l[52362.144]: port 1: new foreign master 003048.fffe.345fe2-1
ptp4l[52366.144]: selected best master clock 003048.fffe.345fe2
ptp4l[52366.144]: foreign master not using PTP timescale
ptp4l[52366.144]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
So it starts in uncalibrated mode, to prime the servo.
ptp4l[52367.211]: master offset -99132 s0 freq +599999998 path delay 46363
ptp4l[52368.211]: master offset -107849 s0 freq +599999998 path delay 46363
ptp4l[52369.211]: master offset -125869 s0 freq +599999998 path delay 55328
ptp4l[52370.211]: master offset -134962 s1 freq +599994160 path delay 55306
ptp4l[52371.211]: master offset -599870879 s2 freq +186802553 path delay 55306
ptp4l[52371.211]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
ptp4l[52372.211]: master offset -786797418 s2 freq -149872567 path delay 55306
ptp4l[52373.211]: master offset -637009628 s2 freq -599999999 path delay 58203
ptp4l[52374.211]: master offset -37092094 s2 freq -37208915 path delay 60342
ptp4l[52375.211]: master offset 244012 s2 freq +177926 path delay 58203
You didn't let it run long enough to stabilize but restarting is probably fine.
kong linuxptp # ptp4l -i eth0 -l 6 -m -f ptp.conf
ptp4l[52498.377]: selected /dev/ptp0 as PTP clock
ptp4l[52498.378]: port 1: INITIALIZING to LISTENING on INITIALIZE
ptp4l[52498.378]: port 0: INITIALIZING to LISTENING on INITIALIZE
ptp4l[52500.147]: port 1: new foreign master 003048.fffe.345fe2-1
ptp4l[52504.147]: selected best master clock 003048.fffe.345fe2
ptp4l[52504.147]: foreign master not using PTP timescale
ptp4l[52504.147]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
ptp4l[52506.215]: master offset -125996 s0 freq +177925 path delay 66149
ptp4l[52507.215]: master offset -114049 s0 freq +177925 path delay 46374
ptp4l[52508.215]: master offset -126964 s0 freq +177925 path delay 49315
ptp4l[52509.215]: master offset -136826 s1 freq +177442 path delay 52187
ptp4l[52510.215]: master offset -188068 s2 freq -21513 path delay 52187
ptp4l[52510.215]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
ptp4l[52511.216]: master offset -176435 s2 freq -97696 path delay 52187
ptp4l[52512.215]: master offset -89938 s2 freq -100192 path delay 55059
ptp4l[52513.216]: master offset 1079 s2 freq -9460 path delay 54937
ptp4l[52514.215]: master offset 1241 s2 freq -8949 path delay 54937
ptp4l[52515.216]: master offset 1141 s2 freq -8099 path delay 54815
ptp4l[52516.216]: master offset 1815 s2 freq -7358 path delay 54242
ptp4l[52517.216]: master offset 2330 s2 freq -5907 path delay 51492
ptp4l[52518.216]: master offset -102 s2 freq -7236 path delay 51492
ptp4l[52519.216]: master offset -2551 s2 freq -10142 path delay 52478
ptp4l[52520.216]: master offset -1582 s2 freq -10594 path delay 53035
ptp4l[52521.216]: master offset -90 s2 freq -9519 path delay 53035
ptp4l[52522.216]: master offset 543 s2 freq -8594 path delay 53035
ptp4l[52523.216]: master offset -958 s2 freq -8997 path delay 54242
ptp4l[52524.216]: master offset -656 s2 freq -9885 path delay 54242
ptp4l[52525.216]: master offset -735 s2 freq -9723 path delay 54242
ptp4l[52526.216]: master offset 2322 s2 freq -8113 path delay 53035
ptp4l[52527.216]: master offset 1049 s2 freq -8521 path delay 53035
ptp4l[52528.216]: master offset 575 s2 freq -8585 path delay 53035
ptp4l[52529.216]: master offset 2479 s2 freq -7484 path delay 51843
ptp4l[52530.216]: master offset -655 s2 freq -8534 path delay 53237
ptp4l[52531.216]: master offset -3311 s2 freq -10634 path delay 54517
ptp4l[52532.216]: master offset -1199 s2 freq -9462 path delay 54517
ptp4l[52533.216]: master offset -542 s2 freq -9120 path delay 55101
ptp4l[52534.216]: master offset 529 s2 freq -8762 path delay 54517
ptp4l[52535.216]: master offset 2668 s2 freq -8258 path delay 51809
ptp4l[52536.216]: master offset 4866 s2 freq -7613 path delay 48985
ptp4l[52537.216]: master offset 2497 s2 freq -8015 path delay 48985
ptp4l[52538.216]: master offset 3183 s2 freq -7700 path delay 48012
ptp4l[52539.216]: master offset 3050 s2 freq -7731 path delay 48012
ptp4l[52540.216]: master offset -2442 s2 freq -8956 path delay 51809
ptp4l[52541.216]: master offset -5123 s2 freq -9862 path delay 53957
ptp4l[52542.216]: master offset -3089 s2 freq -9278 path delay 53957
ptp4l[52543.217]: master offset -1177 s2 freq -8960 path delay 52186
ptp4l[52544.216]: master offset -1132 s2 freq -9018 path delay 52186
ptp4l[52545.217]: master offset -1822 s2 freq -8931 path delay 52186
ptp4l[52546.217]: master offset 2700 s2 freq -8034 path delay 48389
ptp4l[52547.217]: master offset 39 s2 freq -9219 path delay 49659
ptp4l[52548.217]: master offset 276 s2 freq -8977 path delay 49659
ptp4l[52549.217]: master offset -2340 s2 freq -8971 path delay 52804
ptp4l[52550.217]: master offset -2604 s2 freq -8968 path delay 52804
ptp4l[52551.217]: master offset -2020 s2 freq -8958 path delay 52804
ptp4l[52552.217]: master offset 304 s2 freq -8816 path delay 49659
ptp4l[52553.217]: master offset 1610 s2 freq -8671 path delay 48560
ptp4l[52554.217]: master offset 3023 s2 freq -8503 path delay 47050
ptp4l[52555.217]: master offset 4238 s2 freq -8378 path delay 45505
It is possible the PLL is going nuts on startup?
It's probably due to starting in the weird state you did. If you let it run long enough it should stabilize.

If you leave this running do you *ever* see "clock check" errors like we were seeing before with both phc2sys and ptp4l?
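
For a long unattended run, the log can be scanned mechanically rather than by eye. The sketch below is one way to do that, under a few assumptions: each log entry is on a single line (not wrapped as in some of the pasted output here), "s2" is taken to mean the servo considers itself locked, and the 1 ms threshold is an arbitrary choice. The scan helper is not part of linuxptp.

```python
import re

OFFSET_RE = re.compile(
    r"ptp4l\[(?P<t>[\d.]+)\]: master offset\s+(?P<offset>-?\d+)\s+"
    r"s(?P<state>\d)\s+freq\s+(?P<freq>[+-]\d+)\s+path delay\s+(?P<delay>-?\d+)"
)


def scan(lines, max_offset_ns=1_000_000):
    """Yield (timestamp, reason) for log lines that look like trouble."""
    for line in lines:
        if "clockcheck:" in line:
            # Timestamp is the text between "[" and "]".
            yield line.split("]")[0].split("[")[1], "clockcheck"
            continue
        m = OFFSET_RE.search(line)
        if m and m["state"] == "2" and abs(int(m["offset"])) > max_offset_ns:
            yield m["t"], f"offset {m['offset']} ns while locked"


sample = [
    "ptp4l[365.571]: clockcheck: clock jumped forward or running faster than expected!",
    "ptp4l[52513.216]: master offset 1079 s2 freq -9460 path delay 54937",
    "ptp4l[368.321]: master offset 70368744179632 s2 freq +599999999 path delay 58263",
]
for when, why in scan(sample):
    print(when, why)
```

Feeding it the output of "ptp4l ... -m" captured to a file would flag both the clockcheck message and any locked-state offsets above the threshold.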
Also if you have the kernel documentation installed, you can use
Documentation/ptp/testptp.c in order to read the PTP hardware clock
directly.
Try building that and running
kong ptp # make testptp
cc testptp.c -o testptp
testptp.c:(.text+0xa78): undefined reference to `timer_create'
testptp.c:(.text+0xad8): undefined reference to `timer_settime'
testptp.c:(.text+0xb04): undefined reference to `timer_delete'
testptp.c:(.text+0xb50): undefined reference to `timer_create'
testptp.c:(.text+0xbbf): undefined reference to `timer_settime'
collect2: error: ld returned 1 exit status
<builtin>: recipe for target 'testptp' failed
make: *** [testptp] Error 1
Ideas?
Try building it like "make M=Documentation/ptp/" from the top level.
./testptp -g -d /dev/ptpX
Where ptpX is your device's ptp device, which you can see via ethtool
-T
You can use that to sanity check whether the PTP device is being set
correctly. It should match your remote PTP master's time.
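
If building testptp is a nuisance, the PHC can also be read from a script. This is a sketch assuming Linux and a Python new enough to have time.clock_gettime_ns; it relies on the kernel's dynamic POSIX clock convention of turning an open file descriptor into a clock id, which is the same trick testptp.c uses. The /dev/ptp0 default matches the "PTP Hardware Clock: 0" line in the ethtool output, and read_phc_ns is a made-up name.

```python
import os
import time

CLOCKFD = 3  # low bits that mark a clock id as file-descriptor based


def fd_to_clockid(fd):
    """Python equivalent of the FD_TO_CLOCKID macro in testptp.c."""
    return (~fd << 3) | CLOCKFD


def read_phc_ns(device="/dev/ptp0"):
    """Read a PTP hardware clock in nanoseconds (Linux only)."""
    fd = os.open(device, os.O_RDONLY)
    try:
        return time.clock_gettime_ns(fd_to_clockid(fd))
    finally:
        os.close(fd)


# Possible use, run as a user who can open the device:
#   print(read_phc_ns("/dev/ptp0") - time.time_ns())
```

The difference printed by the commented-out line is a rough PHC-minus-system offset in nanoseconds; the two reads are not simultaneous, so it is only a sanity check.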
If I can build it...
RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97701
Gary E. Miller
2015-02-26 01:34:44 UTC
Permalink
Yo Jacob E!

On Tue, 24 Feb 2015 23:31:37 +0000
Post by Keller, Jacob E
Hmm, I see the confusion, I never mentioned my phc2sys offset.
Until it goes to 70,000 seconds....
Correct. I was not assuming that you had another way to measure
system time offset from the master.
Not much of a test if no baseline time standard.
Post by Keller, Jacob E
See if you get the same clock jump issue or not.
Since my ptp4l is in linreg mode, it can not jump my system clock.
To be clear, linreg is just a linear regression servo which could
control the system clock, if you were doing software timestamping. In
this case linreg is controlling the hardware MAC clock.
Hmm, that could be documented better. So the invisible hand inside
turns off the linreg->sysclock connection when hardware timestamping
enabled...
Post by Keller, Jacob E
This doesn't look ugly, you just likely mis-understand the output.
I stand by my statement. Taking a perfectly good recently synced clock
and whacking it by 600mSec is not good. The PLL startup is whacko.
Post by Keller, Jacob E
delay 55328
ptp4l[52370.211]: master offset -134962 s1 freq +599994160 path
delay 55306
ptp4l[52371.211]: master offset -599870879 s2 freq +186802553 path
delay 55306
ptp4l[52371.211]: port 1: UNCALIBRATED to SLAVE on
MASTER_CLOCK_SELECTED
ptp4l[52372.211]: master offset -786797418 s2 freq -149872567 path
delay 55306
ptp4l[52373.211]: master offset -637009628 s2 freq -599999999 path
delay 58203
ptp4l[52374.211]: master offset -37092094 s2 freq -37208915 path
delay 60342
ptp4l[52375.211]: master offset 244012 s2 freq +177926 path
delay 58203
You didn't let it run long enough to stabilize but restarting is probably fine.
Taking long to stabilize, from a stable condition, is not fine, possibly
tolerable. As long as the time is not passed to ntpshm, which I suspect
it is.
Post by Keller, Jacob E
It is possible the PLL is going nuts on startup?
It's probably due to starting in the weird state you did. If you let
it run long enough it should stabilize.
Weird state? What would you expect after a power cycle? The whole
point of an initialization is nothing is initialized yet!
Post by Keller, Jacob E
If you leave this running do you *ever* see "clock check" errors like
we were seeing before with both phc2sys and ptp4l?
My eyes glaze over. Since ptp4l software mode works I assume that part
works.
Post by Keller, Jacob E
Try building it like "make M=Documentation/ptp/" from the top level.
That worked. Maybe that should be documented?

In any case, I have given up on the i217-LM. I have determined that the
82574L and the I210, on the same host with everything else identical, work
fine for me.

So either the i217-LM, or the way the e1000e drives it, is buggy.

Imagine some poor Ubuntu user trying to figure that out...

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97701
***@rellim.com Tel:+1(541)382-8588
Keller, Jacob E
2015-02-26 17:41:16 UTC
Permalink
Hi,

Sorry to comment so far back in the thread, but...
Post by Gary E. Miller
Yo Jacob E!
On Tue, 24 Feb 2015 23:31:37 +0000
Post by Keller, Jacob E
Hmm, I see the confusion, I never mentioned my phc2sys offset.
Until it goes to 70,000 seconds....
Correct. I was not assuming that you had another way to measure
system time offset from the master.
Not much of a test if no baseline time standard.
Yes this makes sense.
Post by Gary E. Miller
Post by Keller, Jacob E
See if you get the same clock jump issue or not.
Since my ptp4l is in linreg mode, it can not jump my system clock.
To be clear, linreg is just a linear regression servo which could
control the system clock, if you were doing software timestamping. In
this case linreg is controlling the hardware MAC clock.
Hmm, that could be documented better. So the invisible hand inside
turns off the linreg->sysclock connection when hardware timestamping
enabled...
....

If you use software timestamps you are controlling the clock that made
those (software) timestamps.

If you do hardware timestamps, you are controlling the clock that made
those (hardware) timestamps.

ptp4l controls one clock. Sometimes you get "lucky" in that you were
doing software timestamps so it could directly control the software
clock. You could also run in free-running mode so that it doesn't
directly control the clock, or you could expose the timestamps as an NTP
SHM and have ntp daemon control the clock.
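
As a rough way to keep the clocks straight, the rule described above can be modeled in a few lines of Python. This is only a sketch of the explanation in this message, not of ptp4l's internals; the function name `controlled_clock`, its arguments, and the returned strings are made up for illustration.

```python
def controlled_clock(time_stamping: str, free_running: bool = False,
                     servo: str = "pi") -> str:
    """Return which clock ptp4l disciplines, per the explanation above.

    Hypothetical helper: names and return strings are illustrative only.
    """
    if free_running:
        # Free-running mode: offsets are measured, no clock is adjusted.
        return "none (measure only)"
    if servo == "ntpshm":
        # Samples are exported through an NTP SHM segment; an NTP daemon
        # decides what to do with the system clock.
        return "none directly (NTP daemon steers the system clock)"
    if time_stamping == "software":
        # Software timestamps are taken from the system clock.
        return "system clock (CLOCK_REALTIME)"
    if time_stamping == "hardware":
        # Hardware timestamps are taken from the NIC's PTP hardware clock.
        return "PHC (e.g. /dev/ptp0)"
    raise ValueError(f"unknown time_stamping mode: {time_stamping!r}")


print(controlled_clock("hardware", servo="linreg"))
print(controlled_clock("software", servo="linreg"))
```

In the hardware case the system clock is untouched by ptp4l; carrying the PHC time over to it is the job of a separate phc2sys process.
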
Post by Gary E. Miller
Post by Keller, Jacob E
This doesn't look ugly, you just likely misunderstand the output.
I stand by my statement. Taking a perfectly good recently synced clock
and whacking it by 600mSec is not good. The PLL startup is whacko.
The beginning is bad, yes. But we actually don't try to jump the servo
until ...
Post by Gary E. Miller
Post by Keller, Jacob E
delay 55328
ptp4l[52370.211]: master offset -134962 s1 freq +599994160 path delay 55306
ptp4l[52371.211]: master offset -599870879 s2 freq +186802553 path delay 55306
ptp4l[52371.211]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
Here. After which it already went bonkers due to the large jump. This is
probably the result of the same driver issue noticed before. Note that
before, it "looked" ok, but the freq value was way off, and once we
tried to reset it, it started going crazy. Again, I think this is a sign
of the same underlying problem with the i217-LM.
Post by Gary E. Miller
Post by Keller, Jacob E
ptp4l[52372.211]: master offset -786797418 s2 freq -149872567 path delay 55306
ptp4l[52373.211]: master offset -637009628 s2 freq -599999999 path delay 58203
ptp4l[52374.211]: master offset -37092094 s2 freq -37208915 path delay 60342
ptp4l[52375.211]: master offset 244012 s2 freq +177926 path delay 58203
You didn't let it run long enough to stabilize but restarting is probably fine.
Taking long to stabilize, from a stable condition, is not fine, possibly
tolerable, as long as the time is not passed to ntpshm, which I suspect
it is.
Post by Keller, Jacob E
Is it possible the PLL is going nuts on startup?
It's probably due to starting in the weird state you did. If you let
it run long enough it should stabilize.
Weird state? What would you expect after a power cycle? The whole
point of an initialization is nothing is initialized yet!
You power cycled the machine?
Post by Gary E. Miller
Post by Keller, Jacob E
If you leave this running do you *ever* see "clock check" errors like
we were seeing before with both phc2sys and ptp4l?
My eyes glaze over. Since ptp4l software mode works I assume that part
works.
Nope. The clock check was only occurring in your output with hardware
timestamping.
Post by Gary E. Miller
Post by Keller, Jacob E
Try building it like "make M=Documentation/ptp/" from the top level.
That worked. Maybe that should be documented?
That is how the kernel make system works. When you run make in the top
level and name testphc, you are using implicit makefile rules, which is
why the build did not link with -lrt.
Post by Gary E. Miller
In any case, I have given up on the i217-LM. I have determined that the
82574L and the I210, on the same host with everything else identical, work
fine for me.
Yep.
Post by Gary E. Miller
So either the i217-LM, or the way the e1000e drives it, is buggy.
Imagine some poor Ubuntu user trying to figure that out...
Yep. I'm hoping to get enough information so I can pass this on to the
team that does the driver for the i217-LM and see if they can try to fix
it :)

Regards,
Jake
Post by Gary E. Miller
RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97701
Gary E. Miller
2015-02-26 18:16:38 UTC
Yo Jacob E!

On Thu, 26 Feb 2015 17:41:16 +0000
Post by Keller, Jacob E
Post by Gary E. Miller
Post by Keller, Jacob E
To be clear, linreg is just a linear regression servo which could
control the system clock, if you were doing software
timestamping. In this case linreg is controlling the hardware MAC
clock.
Hmm, that could be documented better. So the invisible hand inside
turns off the linreg->sysclock connection when hardware timestamping
is enabled...
....
If you use software timestamps you are controlling the clock that made
those (software) timestamps.
If you do hardware timestamps, you are controlling the clock that made
those (hardware) timestamps.
ptp4l controls one clock. Sometimes you get "lucky" in that you were
doing software timestamps so it could directly control the software
clock. You could also run in free-running mode so that it doesn't
directly control the clock, or you could expose the timestamps as an
NTP SHM and have ntp daemon control the clock.
Every time someone says "the clock" I get lost. I got clocks all
over the place! System clock, hardware timestamp clock, NTP clock...

For now the only way I see to (easily) compare a good PPS clock to
either the software timestamp clock or the hardware timestamp clock
is with NTP SHM. Otherwise hand computing jitter, offset, etc.
would be a PITA.

You got another easy calibration trick?
Post by Keller, Jacob E
Post by Gary E. Miller
Post by Keller, Jacob E
Is it possible the PLL is going nuts on startup?
It's probably due to starting in the weird state you did. If you
let it run long enough it should stabilize.
Weird state? What would you expect after a power cycle? The whole
point of an initialization is nothing is initialized yet!
You power cycled the machine?
Hardware troubleshooting 101, power cycle often.
Post by Keller, Jacob E
Post by Gary E. Miller
Post by Keller, Jacob E
Try building it like "make M=Documentation/ptp/" from the top level.
That worked. Maybe that should be documented?
That is how the kernel make system works. When you run make in the top
level and name testphc, you are using implicit makefile rules, which
is why the build did not link with -lrt.
So that means it does not need to be documented? The Linux Documentation
tree is full of READMEs, just not in ptp. But I accept this is not the
right place to complain about that...
Post by Keller, Jacob E
Post by Gary E. Miller
In any case, I have given up on the i217-LM. I have determined that the
82574L and the I210, on the same host with everything else identical,
work fine for me.
Well, not so fine. The jitter on the 82574L is worse than NTP over the
same link and the I210 has a stubborn 833 mSec offset.

So I'm about to call three strikes and you are out.
Post by Keller, Jacob E
Yep. I'm hoping to get enough information so I can pass this on to the
team that does the driver for the i217-LM and see if they can try to
fix it :)
Great, thanks.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97701
***@rellim.com Tel:+1(541)382-8588

Richard Cochran
2015-02-26 08:03:27 UTC
Just for the record...
See if you get the same clock jump issue or not.
Since my ptp4l is in linreg mode, it can not jump my system clock.
So what should I look for?
# killall ptp4l phc2sys
ptp4l: no process found
phc2sys: no process found
# ptp4l -i eth0 -l 6 -m -f ptp.conf
ptp4l[52360.386]: selected /dev/ptp0 as PTP clock
ptp4l[52360.386]: port 1: INITIALIZING to LISTENING on INITIALIZE
ptp4l[52360.386]: port 0: INITIALIZING to LISTENING on INITIALIZE
ptp4l[52362.144]: port 1: new foreign master 003048.fffe.345fe2-1
ptp4l[52366.144]: selected best master clock 003048.fffe.345fe2
ptp4l[52366.144]: foreign master not using PTP timescale
ptp4l[52366.144]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
ptp4l[52367.211]: master offset -99132 s0 freq +599999998 path delay 46363
Here the huge initial frequency offset is left over from the previous
run. When starting up, the ptp4l program reads out the existing
offset and uses it as a starting point.
ptp4l[52368.211]: master offset -107849 s0 freq +599999998 path delay 46363
ptp4l[52369.211]: master offset -125869 s0 freq +599999998 path delay 55328
ptp4l[52370.211]: master offset -134962 s1 freq +599994160 path delay 55306
ptp4l[52371.211]: master offset -599870879 s2 freq +186802553 path delay 55306
ptp4l[52371.211]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
ptp4l[52372.211]: master offset -786797418 s2 freq -149872567 path delay 55306
ptp4l[52373.211]: master offset -637009628 s2 freq -599999999 path delay 58203
ptp4l[52374.211]: master offset -37092094 s2 freq -37208915 path delay 60342
ptp4l[52375.211]: master offset 244012 s2 freq +177926 path delay 58203
Here the program exits with a much more reasonable frequency offset.
kong linuxptp # ptp4l -i eth0 -l 6 -m -f ptp.conf
ptp4l[52498.377]: selected /dev/ptp0 as PTP clock
ptp4l[52498.378]: port 1: INITIALIZING to LISTENING on INITIALIZE
ptp4l[52498.378]: port 0: INITIALIZING to LISTENING on INITIALIZE
ptp4l[52500.147]: port 1: new foreign master 003048.fffe.345fe2-1
ptp4l[52504.147]: selected best master clock 003048.fffe.345fe2
ptp4l[52504.147]: foreign master not using PTP timescale
ptp4l[52504.147]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
ptp4l[52506.215]: master offset -125996 s0 freq +177925 path delay 66149
ptp4l[52507.215]: master offset -114049 s0 freq +177925 path delay 46374
ptp4l[52508.215]: master offset -126964 s0 freq +177925 path delay 49315
ptp4l[52509.215]: master offset -136826 s1 freq +177442 path delay 52187
ptp4l[52510.215]: master offset -188068 s2 freq -21513 path delay 52187
ptp4l[52510.215]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
ptp4l[52511.216]: master offset -176435 s2 freq -97696 path delay 52187
ptp4l[52512.215]: master offset -89938 s2 freq -100192 path delay 55059
ptp4l[52513.216]: master offset 1079 s2 freq -9460 path delay 54937
ptp4l[52514.215]: master offset 1241 s2 freq -8949 path delay 54937
...
Is it possible the PLL is going nuts on startup?
No, it just starts playing with really bad cards.
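
To compare the two runs above without reading every line, the servo lines can be pulled apart with a small parser. This is a sketch that assumes the "master offset ... sN freq ... path delay ..." layout seen in this trace; the regular expression and the field names are my own, and the freq value is treated as parts per billion.

```python
import re

# Matches ptp4l servo lines such as:
# ptp4l[52367.211]: master offset -99132 s0 freq +599999998 path delay 46363
LINE_RE = re.compile(
    r"ptp4l\[(?P<t>[\d.]+)\]: master offset\s+(?P<offset>-?\d+)\s+"
    r"s(?P<state>\d)\s+freq\s+(?P<freq>[+-]?\d+)\s+path delay\s+(?P<delay>-?\d+)"
)


def parse(line):
    """Return the servo fields of one log line, or None if it is another kind of line."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return {
        "time": float(m["t"]),
        "offset_ns": int(m["offset"]),
        "state": int(m["state"]),
        "freq_ppb": int(m["freq"]),
        "delay_ns": int(m["delay"]),
    }


# First servo line of each of the two runs shown above.
first_run = parse("ptp4l[52367.211]: master offset -99132 s0 freq +599999998 path delay 46363")
second_run = parse("ptp4l[52506.215]: master offset -125996 s0 freq +177925 path delay 66149")

for name, rec in (("first", first_run), ("second", second_run)):
    print(f"{name} run starts at {rec['freq_ppb'] / 1e3:.1f} ppm")
```

The second run starts from +177925, essentially the value the first run ended on (+177926), which fits the point that ptp4l picks up whatever frequency adjustment the previous run left behind.
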

Thanks,
Richard
Richard Cochran
2015-02-25 09:35:41 UTC
I have been out with the flu, but let us take a look...
Post by Gary E. Miller
Yo Jacob!
Just to summarize what I just tried, that failed. I repeated several times,
similar results, it just took varying times before going crazy, usually
10 to 90 seconds.
kong ~ # ethtool -T eth0
hardware-transmit (SOF_TIMESTAMPING_TX_HARDWARE)
software-transmit (SOF_TIMESTAMPING_TX_SOFTWARE)
hardware-receive (SOF_TIMESTAMPING_RX_HARDWARE)
software-receive (SOF_TIMESTAMPING_RX_SOFTWARE)
software-system-clock (SOF_TIMESTAMPING_SOFTWARE)
hardware-raw-clock (SOF_TIMESTAMPING_RAW_HARDWARE)
PTP Hardware Clock: 0
off (HWTSTAMP_TX_OFF)
on (HWTSTAMP_TX_ON)
none (HWTSTAMP_FILTER_NONE)
all (HWTSTAMP_FILTER_ALL)
ptpv1-l4-sync (HWTSTAMP_FILTER_PTP_V1_L4_SYNC)
ptpv1-l4-delay-req (HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ)
ptpv2-l4-sync (HWTSTAMP_FILTER_PTP_V2_L4_SYNC)
ptpv2-l4-delay-req (HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ)
ptpv2-l2-sync (HWTSTAMP_FILTER_PTP_V2_L2_SYNC)
ptpv2-l2-delay-req (HWTSTAMP_FILTER_PTP_V2_L2_DELAY_REQ)
ptpv2-event (HWTSTAMP_FILTER_PTP_V2_EVENT)
ptpv2-sync (HWTSTAMP_FILTER_PTP_V2_SYNC)
ptpv2-delay-req (HWTSTAMP_FILTER_PTP_V2_DELAY_REQ)
kong ~ # killall ptp4l phc2sys
ptp4l: no process found
phc2sys: no process found
kong ~ # killall ptp4l phc2sys
ptp4l: no process found
phc2sys: no process found
kong ~ # cat ptp.conf
[global]
clock_servo linreg
I recommend using the pi servo. See recent post on -users list.
Post by Gary E. Miller
kong ~ # ptp4l -i eth0 -l 7 -m -f ptp.conf &
kong ~ # phc2sys -a -r -E ntpshm -m -M 2
phc2sys[354.145]: uds: sendto failed: No such file or directory
This one is odd, is uds_address not defaulted as documented?
Sadly, adding uds_address /var/run/ptp4l to my ptp.conf does not change
anything.
ptp4l[354.146]: selected /dev/ptp0 as PTP clock
ptp4l[354.183]: port 1: INITIALIZING to LISTENING on INITIALIZE
ptp4l[354.183]: port 0: INITIALIZING to LISTENING on INITIALIZE
ptp4l[354.570]: port 1: setting asCapable
phc2sys[355.146]: Waiting for ptp4l...
ptp4l[355.197]: port 0: setting asCapable
ptp4l[355.674]: port 1: new foreign master 003048.fffe.345fe2-1
phc2sys[356.198]: reconfiguring after port state change
phc2sys[356.198]: selecting eth0 for synchronization
phc2sys[356.198]: nothing to synchronize
ptp4l[359.341]: selected best master clock 003048.fffe.345fe2
ptp4l[359.341]: foreign master not using PTP timescale
Your master isn't using the PTP timescale? That is suspicious. What
is your grand master?
Post by Gary E. Miller
ptp4l[359.341]: port 1: LISTENING to UNCALIBRATED on RS_SLAVE
ptp4l[359.361]: port 1: delay timeout
ptp4l[359.526]: port 1: delay timeout
phc2sys[360.198]: port 002590.fffe.f355da-1 changed state
phc2sys[360.198]: reconfiguring after port state change
phc2sys[360.198]: master clock not ready, waiting...
ptp4l[360.352]: port 1: delay timeout
ptp4l[360.353]: path delay 58263 58263
ptp4l[360.987]: master offset 16506354212 s0 freq -0 path delay 58263
The path delay is enormous. I guess your master uses software time
stamping, or you have several switches in line.
Post by Gary E. Miller
ptp4l[361.746]: port 1: delay timeout
ptp4l[361.746]: path delay 59135 60008
ptp4l[361.904]: master offset 16506344102 s0 freq -0 path delay 59135
ptp4l[362.820]: master offset 16506334623 s0 freq -0 path delay 59135
ptp4l[363.681]: port 1: delay timeout
ptp4l[363.681]: path delay 58263 47033
ptp4l[363.737]: linreg: points 4 slope 1.000008825 intercept -16506326986 err 0
ptp4l[363.737]: master offset 16506327955 s1 freq -9794 path delay 58263
ptp4l[364.654]: linreg: points 4 slope 1.000008873 intercept 613 err 1412
ptp4l[364.654]: master offset -1412 s2 freq -9485 path delay 58263
ptp4l[364.654]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
phc2sys[365.199]: port 002590.fffe.f355da-1 changed state
phc2sys[365.199]: reconfiguring after port state change
phc2sys[365.199]: selecting CLOCK_REALTIME for synchronization
phc2sys[365.199]: selecting eth0 as the master clock
phc2sys[365.199]: phc offset -70353239245525 s0 freq +0 delay 1348
WTF was that???
Looking two lines further...
Post by Gary E. Miller
ptp4l[365.571]: clockcheck: clock jumped forward or running faster than expected!
ptp4l[365.571]: master offset 70368744176888 s0 freq -9485 path delay 58263
The ptp4l program has measured a 19.5 hour offset between the local
PHC and the GM clock.
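
The 19.5 hour figure is just the logged master offset converted from nanoseconds, as this short Python check shows. The comparison against a power of two is an extra observation about the number itself; whether it points at a counter wrap or a flipped bit somewhere in the NIC or driver is not established anywhere in this thread.

```python
offset_ns = 70368744176888  # "master offset" from the clockcheck line above

hours = offset_ns / 1e9 / 3600
print(f"{hours:.2f} hours")

# Distance from the nearest power of two, in nanoseconds.
bits = offset_ns.bit_length()
nearest = min((1 << (bits - 1), 1 << bits), key=lambda p: abs(p - offset_ns))
print(f"2**{nearest.bit_length() - 1} - offset = {nearest - offset_ns} ns")
```
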

There are three possible explanations:

1. Someone reset the PTP Hardware Clock (PHC) behind our backs.

2. Someone reset the grand master's clock.

3. The GM is buggy and delivers broken time stamps.

Inspecting the protocol time stamps (wireshark, tcpdump) will narrow
it down to #1 versus 2/3.

The whole rest of the trace just shows how ptp4l tries to correct an
almost 20 hour offset.
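
For a feel of how long that correction would take, here is a back-of-the-envelope estimate in Python using four consecutive "master offset" values from the quoted trace (at 368.321 through 371.071). It assumes the offset keeps shrinking at the same rate, with freq pinned at +599999999, so treat the result as an order of magnitude only.

```python
# (local log time in s, master offset in ns) from consecutive ptp4l lines
samples = [
    (368.321, 70368744179632),
    (369.237, 70368144249873),
    (370.154, 70367544220469),
    (371.071, 70366944190386),
]

(t0, off0), (t1, off1) = samples[0], samples[-1]

# Offset removed per second of log time while the servo is saturated.
rate_ns_per_s = (off0 - off1) / (t1 - t0)

# Time needed to remove what is left at that rate.
remaining_s = off1 / rate_ns_per_s

print(f"offset shrinks by about {rate_ns_per_s / 1e6:.0f} ms per second")
print(f"roughly {remaining_s / 3600:.0f} hours to slew away the rest")
```

At that pace the slew would take more than a day, so the trace never gets anywhere near convergence before it is stopped.
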

At this point, I would investigate the cause of the sudden, huge
offset. You can test the problem using ptp4l in isolation. No point
in running phc2sys if you've got such massive GM errors.

Thanks,
Richard
Post by Gary E. Miller
ptp4l[365.571]: port 1: SLAVE to UNCALIBRATED on SYNCHRONIZATION_FAULT
ptp4l[366.164]: port 1: delay timeout
ptp4l[366.164]: path delay 58096 57929
phc2sys[366.199]: port 002590.fffe.f355da-1 changed state
phc2sys[366.199]: reconfiguring after port state change
phc2sys[366.199]: master clock not ready, waiting...
ptp4l[366.487]: master offset 70368744178475 s0 freq -9485 path delay 58096
ptp4l[367.404]: master offset 70368744179257 s0 freq -9485 path delay 58096
ptp4l[367.495]: port 1: delay timeout
ptp4l[367.495]: path delay 58263 61062
ptp4l[368.321]: linreg: points 4 slope 1.000008584 intercept -70368744179915 err 1412
ptp4l[368.321]: master offset 70368744179632 s2 freq +599999999 path delay 58263
ptp4l[368.321]: port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
ptp4l[368.810]: port 1: delay timeout
ptp4l[368.810]: negative path delay -69667
ptp4l[368.810]: path_delay = (t2 - t3) * rr + (t4 - t1) - (c1 + c2 + c3)
ptp4l[368.810]: t2 - t3 = -213724133
ptp4l[368.810]: t4 - t1 = +534175583
ptp4l[368.810]: rr = 2.500021454
ptp4l[368.810]: c1 0
ptp4l[368.810]: c2 0
ptp4l[368.810]: c3 0
ptp4l[368.810]: path delay 58096 -69667
ptp4l[369.140]: port 1: delay timeout
ptp4l[369.140]: negative path delay -60229
ptp4l[369.140]: path_delay = (t2 - t3) * rr + (t4 - t1) - (c1 + c2 + c3)
ptp4l[369.140]: t2 - t3 = -357590119
ptp4l[369.140]: t4 - t1 = +893862511
ptp4l[369.140]: rr = 2.500021454
ptp4l[369.140]: c1 0
ptp4l[369.140]: c2 0
ptp4l[369.140]: c3 0
ptp4l[369.140]: path delay 57929 -60229
phc2sys[369.199]: port 002590.fffe.f355da-1 changed state
phc2sys[369.199]: reconfiguring after port state change
phc2sys[369.199]: selecting CLOCK_REALTIME for synchronization
phc2sys[369.199]: selecting eth0 as the master clock
phc2sys[369.199]: phc offset -70353027879345 s0 freq +0 delay 1391
ptp4l[369.237]: linreg: points 4 slope 0.999941194 intercept -70368144182166 err 225182
ptp4l[369.237]: master offset 70368144249873 s2 freq +599999999 path delay 57929
ptp4l[370.073]: port 1: delay timeout
ptp4l[370.073]: path delay 58096 71844
ptp4l[370.154]: linreg: points 4 slope 0.999918331 intercept -70367544242678 err 113137
ptp4l[370.154]: master offset 70367544220469 s2 freq +599999999 path delay 58096
phc2sys[370.199]: phc offset -70352464198801 s0 freq +0 delay 1345
ptp4l[371.071]: linreg: points 4 slope 0.999940831 intercept -70366944235752 err 113128
ptp4l[371.071]: master offset 70366944190386 s2 freq +599999999 path delay 58096
phc2sys[371.199]: phc offset -70351900518690 s0 freq +0 delay 1379
ptp4l[371.706]: port 1: delay timeout
ptp4l[371.706]: path delay 58263 80317
ptp4l[371.987]: linreg: points 4 slope 1.000008261 intercept -70366344169510 err 112836
ptp4l[371.987]: master offset 70366344169768 s2 freq +599999999 path delay 58263
phc2sys[372.200]: phc offset -70351336831824 s0 freq +0 delay 1347
ptp4l[372.904]: linreg: points 4 slope 1.000008717 intercept -70365744150943 err 90551
ptp4l[372.904]: master offset 70365744150552 s2 freq +599999999 path delay 58263
phc2sys[373.200]: phc offset -70350773141966 s0 freq +0 delay 1388
ptp4l[373.617]: port 1: delay timeout
ptp4l[373.617]: path delay 58103 57944
ptp4l[373.821]: linreg: points 4 slope 1.000008674 intercept -70365144126319 err 75472
ptp4l[373.821]: master offset 70365144126321 s2 freq +599999999 path delay 58103
phc2sys[374.200]: phc offset -70350209474225 s0 freq +0 delay 1348
ptp4l[374.738]: linreg: points 4 slope 1.000008812 intercept -70364544099292 err 64723
ptp4l[374.738]: master offset 70364544099571 s2 freq +599999999 path delay 58103
ptp4l[374.831]: port 1: delay timeout
ptp4l[374.831]: path delay 58976 62467
phc2sys[375.200]: phc offset -70349645810969 s0 freq +0 delay 1347
ptp4l[375.654]: linreg: points 4 slope 1.000008650 intercept -70363944062902 err 56663
ptp4l[375.654]: master offset 70363944062598 s2 freq +599999999 path delay 58976
ptp4l[375.923]: port 1: delay timeout
ptp4l[375.923]: path delay 57936 53829
phc2sys[376.200]: phc offset -70349082116188 s0 freq +0 delay 1460
ptp4l[376.571]: linreg: points 4 slope 1.000008152 intercept -70363344039615 err 50595
ptp4l[376.571]: master offset 70363344040348 s2 freq +599999999 path delay 57936
phc2sys[377.200]: phc offset -70348518427659 s0 freq +0 delay 1343
ptp4l[377.488]: linreg: points 4 slope 1.000007828 intercept -70362744024915 err 45586
ptp4l[377.488]: master offset 70362744024897 s2 freq +599999999 path delay 57936
ptp4l[377.807]: port 1: delay timeout
ptp4l[377.807]: path delay 59445 60946
phc2sys[378.200]: phc offset -70347954751340 s0 freq +0 delay 1353
ptp4l[378.404]: linreg: points 4 slope 1.000008160 intercept -70362143984809 err 44713
ptp4l[378.404]: master offset 70362143983983 s2 freq +599999999 path delay 59445
phc2sys[379.200]: phc offset -70347391089799 s0 freq +0 delay 1388
ptp4l[379.321]: linreg: points 4 slope 1.000009185 intercept -70361543971115 err 43857
ptp4l[379.321]: master offset 70361543970987 s2 freq +599999999 path delay 59445
ptp4l[379.707]: port 1: delay timeout
ptp4l[379.707]: path delay 59445 42090
phc2sys[380.200]: phc offset -70346827428315 s0 freq +0 delay 1462
ptp4l[380.238]: linreg: points 4 slope 1.000009091 intercept -70360943944337 err 42997
ptp4l[380.238]: master offset 70360943944743 s2 freq +599999999 path delay 59445
ptp4l[380.465]: port 1: delay timeout
ptp4l[380.465]: path delay 55886 46490
ptp4l[381.155]: linreg: points 8 slope 1.000008246 intercept -70360343906082 err 33333
ptp4l[381.155]: master offset 70360343908136 s2 freq +599999999 path delay 55886
phc2sys[381.200]: phc offset -70346263731002 s0 freq +0 delay 1463
ptp4l[381.758]: port 1: delay timeout
ptp4l[381.759]: path delay 55886 45742
ptp4l[382.071]: linreg: points 8 slope 1.000008035 intercept -70359743870239 err 32710
ptp4l[382.071]: master offset 70359743871471 s2 freq +599999999 path delay 55886
phc2sys[382.200]: phc offset -70345700043849 s0 freq +0 delay 1464
ptp4l[382.451]: port 1: delay timeout
ptp4l[382.452]: path delay 56747 55551
ptp4l[382.988]: linreg: points 8 slope 1.000008136 intercept -70359143861734 err 32071
ptp4l[382.988]: master offset 70359143861383 s2 freq +599999999 path delay 56747
phc2sys[383.200]: phc offset -70345136355504 s0 freq +0 delay 1461
ptp4l[383.905]: linreg: points 8 slope 1.000008152 intercept -70358543843295 err 31462
ptp4l[383.905]: master offset 70358543842094 s2 freq +599999999 path delay 56747
ptp4l[384.188]: port 1: delay timeout
ptp4l[384.188]: path delay 54690 39523
phc2sys[384.200]: phc offset -70344572667058 s0 freq +0 delay 1350
ptp4l[384.313]: port 1: delay timeout
ptp4l[384.314]: path delay 53598 53368
ptp4l[384.821]: linreg: points 8 slope 1.000007880 intercept -70357943816603 err 30869
ptp4l[384.821]: master offset 70357943817385 s2 freq +599999999 path delay 53598
ptp4l[385.156]: port 1: delay timeout
ptp4l[385.156]: path delay 53598 57616
phc2sys[385.200]: phc offset -70344008980381 s0 freq +0 delay 1464
ptp4l[385.738]: linreg: points 8 slope 1.000007783 intercept -70357343791409 err 30264
ptp4l[385.738]: master offset 70357343791685 s2 freq +599999999 path delay 53598
phc2sys[386.201]: phc offset -70343445298408 s0 freq +0 delay 1466
ptp4l[386.655]: linreg: points 8 slope 1.000007932 intercept -70356743766393 err 29678
ptp4l[386.655]: master offset 70356743766013 s2 freq +599999999 path delay 53598
ptp4l[386.967]: port 1: delay timeout
ptp4l[386.968]: path delay 51744 50120
phc2sys[387.201]: phc offset -70342881625905 s0 freq +0 delay 1352
ptp4l[387.571]: linreg: points 8 slope 1.000007910 intercept -70356143740286 err 29130
ptp4l[387.571]: master offset 70356143742033 s2 freq +599999999 path delay 51744
I stopped it here as it tried to step my good clock by -70356s.
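
The "negative path delay" entries in the trace above print the formula and its inputs, so the numbers can be checked by hand. The Python sketch below plugs in the logged values; the final division by two is my assumption (converting the round trip into a one-way delay), made because it reproduces the logged results, and the function name is hypothetical.

```python
def path_delay(t2_minus_t3, t4_minus_t1, rr, corrections=(0, 0, 0)):
    """Evaluate the formula ptp4l prints, then halve it (assumed) to get
    the one-way delay in nanoseconds."""
    round_trip = t2_minus_t3 * rr + t4_minus_t1 - sum(corrections)
    return round_trip / 2


# Inputs copied from the two "negative path delay" blocks in the trace.
for t2_t3, t4_t1 in ((-213724133, 534175583), (-357590119, 893862511)):
    print(int(path_delay(t2_t3, t4_t1, 2.500021454)))
```

This gives -69667 and -60229, matching the log. The rate ratio rr of about 2.5 is far from the value near 1.0 one would expect between two healthy clocks, which is probably what drives the result negative.
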
RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97701
Gary E. Miller
2015-02-26 01:06:23 UTC
Yo Richard!

On Wed, 25 Feb 2015 10:35:41 +0100
Post by Richard Cochran
I have been out with the flu, but let us take a look...
Well, don't get up on my account. :-)
Post by Richard Cochran
Post by Gary E. Miller
kong ~ # cat ptp.conf
[global]
clock_servo linreg
I recommend using the pi servo. See recent post on -users list.
Reading that thread, Miroslav recommends linreg for the casual user.
Post by Richard Cochran
Post by Gary E. Miller
ptp4l[359.341]: selected best master clock 003048.fffe.345fe2
ptp4l[359.341]: foreign master not using PTP timescale
Your master isn't using the PTP timescale? That is suspicious. What
is your grand master?
Here is how I run my grand master:

ptp4l -f /usr/local/etc/ptp4l.conf &

With this config file:

[global]
time_stamping software
summary_interval 10
clock_servo ntpshm
ntpshm_segment 2

# default priority = 128
priority1 10
priority2 10

[eth0]

Suggestions welcome.
Post by Richard Cochran
The path delay is enormous. I guess your master uses software time
stamping, or you have several switches in line.
Yes, and yes. Hardware timestamping being a bear for me. 3 switches
in the path. The 6uS jitter I see seems pretty good to me.
Post by Richard Cochran
Post by Gary E. Miller
WTF was that???
Looking two lines further...
Post by Gary E. Miller
ptp4l[365.571]: clockcheck: clock jumped forward or running faster than expected!
ptp4l[365.571]: master offset 70368744176888 s0 freq -9485 path delay 58263
The ptp4l program has measured a 19.5 hour offset between the local
PHC and the GM clock.
1. Someone reset the PTP Hardware Clock (PHC) behind our backs.
Not possible.
Post by Richard Cochran
2. Someone reset the grand master's clock.
Not possible.
Post by Richard Cochran
3. The GM is buggy and delivers broken time stamps.
Not possible.

On further testing, replacing the i217-LM with an 82574L, both using the
same e1000e driver, fixed the problem.

So, explanations 4 and 5:

4. Buggy NIC

5. Buggy NIC driver
Post by Richard Cochran
Inspecting the protocol time stamps (wireshark, tcpdump) will narrow
it down to #1 versus 2/3.
But not 4 versus 5.
Post by Richard Cochran
The whole rest of the trace just shows how ptp4l tries to correct an
almost 20 hour offset.
Which is itself pretty laughable and merits its own bug.
Post by Richard Cochran
At this point, I would investigate the cause of the sudden, huge
offset. You can test the problem using ptp4l in isolation. No point
in running phc2sys if you've got such massive GM errors.
Done yesterday, email sent to list with results. Nothing obvious.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97701
***@rellim.com Tel:+1(541)382-8588
Miroslav Lichvar
2015-02-24 10:29:14 UTC
Post by Gary E. Miller
ptp4l[365.571]: clockcheck: clock jumped forward or running faster than expected!
Looks like something else than ptp4l is touching the PHC.
How can that be? I do "killall ptp4l phc2sys" in my test script.
Oh, one other thing. Sometimes after running a timestamp hardware
test I can not revert to timestamp software and get a good time.
After my last test I had a persistent 150mS offset from ptp4l that
would not go away. Killing and restarting ptp4l did not help. I
had to reboot to get back to good time.
These sound like a driver bug to me. What kernel and NIC do you have?
Perhaps Richard might have some suggestions on how to debug this.
--
Miroslav Lichvar
Gary E. Miller
2015-02-24 18:35:15 UTC
Yo Miroslav!

On Tue, 24 Feb 2015 11:29:14 +0100
Post by Miroslav Lichvar
These sound like a driver bug to me. What kernel and NIC do you have?
Kernel is 3.19.0

Here is the NIC:

00:19.0 Ethernet controller: Intel Corporation Ethernet Connection I217-LM (rev 05)
DeviceName: Intel Ethernet i217LM #1
Subsystem: Super Micro Computer Inc Ethernet Connection I217-LM
Flags: bus master, fast devsel, latency 0, IRQ 35
Memory at f7f00000 (32-bit, non-prefetchable) [size=128K]
Memory at f7f32000 (32-bit, non-prefetchable) [size=4K]
I/O ports at f020 [size=32]
Capabilities: [c8] Power Management version 2
Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
Capabilities: [e0] PCI Advanced Features
Kernel driver in use: e1000e
Post by Miroslav Lichvar
Perhaps Richard might have some suggestions on how to debug this.
I'll take most any suggestion.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97701
***@rellim.com Tel:+1(541)382-8588