Discussion:
[Linuxptp-devel] [PATCH RFC 2/2] phc2sys: use the system offset method if available
Richard Cochran
2012-11-26 12:28:14 UTC
Permalink
We use the PTP_SYS_OFFSET ioctl method in preference to doing
clock_gettime from user space, the assumption being that the kernel space
measurement will always be more accurate.

Signed-off-by: Richard Cochran <***@gmail.com>
---
makefile | 4 ++--
phc2sys.c | 25 +++++++++++++++++++++++++
2 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/makefile b/makefile
index 43a58b3..afa5021 100644
--- a/makefile
+++ b/makefile
@@ -31,7 +31,7 @@ PRG = ptp4l pmc phc2sys hwstamp_ctl
OBJ = bmc.o clock.o config.o fsm.o ptp4l.o mave.o msg.o phc.o pi.o port.o \
print.o raw.o servo.o sk.o tlv.o tmtab.o transport.o udp.o udp6.o uds.o util.o

-OBJECTS = $(OBJ) pmc.o phc2sys.o hwstamp_ctl.o
+OBJECTS = $(OBJ) pmc.o phc2sys.o hwstamp_ctl.o sysoff.o
SRC = $(OBJECTS:.o=.c)
DEPEND = $(OBJECTS:.o=.d)
srcdir := $(dir $(lastword $(MAKEFILE_LIST)))
@@ -48,7 +48,7 @@ ptp4l: $(OBJ)

pmc: pmc.o msg.o print.o raw.o sk.o tlv.o transport.o udp.o udp6.o uds.o util.o

-phc2sys: phc2sys.o sk.o print.o
+phc2sys: phc2sys.o sk.o sysoff.o print.o

hwstamp_ctl: hwstamp_ctl.o

diff --git a/phc2sys.c b/phc2sys.c
index a6586a2..19b7803 100644
--- a/phc2sys.c
+++ b/phc2sys.c
@@ -34,6 +34,7 @@

#include "missing.h"
#include "sk.h"
+#include "sysoff.h"

#define KP 0.7
#define KI 0.3
@@ -228,6 +229,26 @@ static int do_pps_loop(char *pps_device, double kp, double ki, clockid_t dst)
return 0;
}

+static int do_sysoff_loop(clockid_t src, clockid_t dst,
+ int rate, int n_readings, int sync_offset,
+ double kp, double ki)
+{
+ uint64_t ts;
+ int64_t offset;
+ int err = 0, fd = CLOCKID_TO_FD(src);
+ while (1) {
+ usleep(1000000 / rate);
+ if (sysoff_measure(fd, n_readings, &offset, &ts)) {
+ err = -1;
+ break;
+ }
+ offset -= sync_offset * NS_PER_SEC;
+ do_servo(&servo, dst, offset, ts, kp, ki);
+ show_servo(stdout, "sys", offset, ts);
+ }
+ return err;
+}
+
static void usage(char *progname)
{
fprintf(stderr,
@@ -323,6 +344,10 @@ int main(int argc, char *argv[])
if (device)
return do_pps_loop(device, kp, ki, dst);

+ if (SYSOFF_SUPPORTED == sysoff_probe(CLOCKID_TO_FD(src)))
+ return do_sysoff_loop(src, dst, phc_rate,
+ phc_readings, sync_offset, kp, ki);
+
while (1) {
usleep(1000000 / phc_rate);
if (!read_phc(src, dst, phc_readings, &phc_offset, &phc_ts)) {
--
1.7.2.5
Miroslav Lichvar
2012-11-26 14:05:32 UTC
Permalink
---
sysoff.c | 104 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
sysoff.h | 44 ++++++++++++++++++++++++++
2 files changed, 148 insertions(+), 0 deletions(-)
+int sysoff_measure(int fd, int n_samples, int64_t *result, uint64_t *ts)
+{
+ struct ptp_sys_offset pso;
+ pso.n_samples = n_samples;
+ if (ioctl(fd, PTP_SYS_OFFSET, &pso)) {
+ perror("ioctl PTP_SYS_OFFSET");
+ return SYSOFF_RUN_TIME_MISSING;
+ }
+ *result = sysoff_estimate(pso.ts, n_samples, ts);
+ return SYSOFF_SUPPORTED;
+}
I think it would be useful to print a more descriptive error message,
when n_samples is larger than PTP_MAX_SAMPLES. Or perhaps include
n_samples in the probe and fall back to the user space sampling with a
warning message?

Just curious, what improvements in the jitter with different n_samples
do you see with the new syscall? (I've not had a chance to try it yet)
--
Miroslav Lichvar
Richard Cochran
2012-11-27 07:22:37 UTC
Permalink
Post by Miroslav Lichvar
I think it would be useful to print a more descriptive error message,
when n_samples is larger than PTP_MAX_SAMPLES. Or perhaps include
n_samples in the probe and fall back to the user space sampling with a
warning message?
Okay, thanks for the comments in your two emails.
Post by Miroslav Lichvar
Just curious, what improvements in the jitter with different n_samples
do you see with the new syscall? (I've not had a chance to try it yet)
IIRC, there was almost no difference for the igb. There was a
noticable improvement for the phyter (on coldfire, via mdio).
I will repeat the measurements more carefully and let you know.

Thanks,
Richard
Richard Cochran
2012-11-28 11:03:26 UTC
Permalink
Post by Miroslav Lichvar
Just curious, what improvements in the jitter with different n_samples
do you see with the new syscall? (I've not had a chance to try it yet)
Okay, I made some measurements using these two patches, plus the patch
below, in order to compare the clock_gettime method with the sys_offset
ioctl method. I published the results under the linuxptp SF page.

http://linuxptp.sourceforge.net/gettime_vs_sysoff/

There is also a gnuplot script for plotting the traces. You need to
edit the top of the file to change files.

Here are a few things I noticed, looking at the graphs.

* The phc2sys servo works well enough with the weights P=0.1 and
I=0.001, considering the noisy inputs. This is apparent by looking
at the drift, which is nice and steady within about 3 ppm. Based on
this observation, I think that we should change the phc2sys default
weights.

* There is really not much difference between the gettime method and
the sys_offset method. The RMS error between the two is as follows.

Board | RMS
-----------------------------+---------
PHYTER on the m5234bcc | 25571.10
BeagleBone | 1058.74
IGB in a 2 CPU Pentium D | 21.16

Looking at the phyter, it isn't really clear whether the gettime
method is worse or not. Based on this observation, I think we should
keep clock_gettime as the the default (since it always is available)
and offer the sys_offset method as a command line option.

* During each trace, I generated a high system load using a script
like the following, with the expectation of making the offset
estimations worse.

#!/bin/sh
while [ 1 ] ; do
find /usr
dd if=/dev/zero of=/dev/null count=1000000
./hackbench -g 1 -T 4 -s 1000 -l 1000
./calibrator 400 1M /tmp/out
done

The readme.org shows where the load starts and stops in each
case. For the coldfire board, I had to leave out both hackbench and
calibrator, since they don't compile on uClinux.

The surprising effect of the load is that is dramatically improves
the estimations. I am not sure why this happens, but one idea is
that the CPU cache actually hurts the clock readings, since the
second system time reading goes faster.

This also explains the better performance that you observed when
running the phc2sys at 100 Hz.

Thanks,
Richard

---
commit 27441a863e429f0f06d51c054a30076265c7b2f0
Author: Richard Cochran <***@gmail.com>
Date: Tue Nov 27 10:43:51 2012 +0100

hack: compare clock_gettime with sys_offset.

diff --git a/phc2sys.c b/phc2sys.c
index 19b7803..c2db02a 100644
--- a/phc2sys.c
+++ b/phc2sys.c
@@ -344,18 +344,27 @@ int main(int argc, char *argv[])
if (device)
return do_pps_loop(device, kp, ki, dst);

- if (SYSOFF_SUPPORTED == sysoff_probe(CLOCKID_TO_FD(src)))
- return do_sysoff_loop(src, dst, phc_rate,
- phc_readings, sync_offset, kp, ki);
+ if (SYSOFF_SUPPORTED != sysoff_probe(CLOCKID_TO_FD(src)))
+ return -1;

while (1) {
usleep(1000000 / phc_rate);
+
if (!read_phc(src, dst, phc_readings, &phc_offset, &phc_ts)) {
continue;
}
phc_offset -= sync_offset * NS_PER_SEC;
+
+ fprintf(stdout, "gettime %9lld ", phc_offset);
+
+ if (sysoff_measure(CLOCKID_TO_FD(src), phc_readings,
+ &phc_offset, &phc_ts)) {
+ break;
+ }
+ phc_offset -= sync_offset * NS_PER_SEC;
+
do_servo(&servo, dst, phc_offset, phc_ts, kp, ki);
- show_servo(stdout, "phc", phc_offset, phc_ts);
+ show_servo(stdout, "sys_offset", phc_offset, phc_ts);
}
return 0;
}
Miroslav Lichvar
2012-11-28 15:18:38 UTC
Permalink
Post by Richard Cochran
Post by Miroslav Lichvar
Just curious, what improvements in the jitter with different n_samples
do you see with the new syscall? (I've not had a chance to try it yet)
Okay, I made some measurements using these two patches, plus the patch
below, in order to compare the clock_gettime method with the sys_offset
ioctl method. I published the results under the linuxptp SF page.
http://linuxptp.sourceforge.net/gettime_vs_sysoff/
There is also a gnuplot script for plotting the traces. You need to
edit the top of the file to change files.
Thanks for the data, very interesting!
Post by Richard Cochran
Here are a few things I noticed, looking at the graphs.
* The phc2sys servo works well enough with the weights P=0.1 and
I=0.001, considering the noisy inputs. This is apparent by looking
at the drift, which is nice and steady within about 3 ppm. Based on
this observation, I think that we should change the phc2sys default
weights.
Yes, I think that's a good idea. What formula do you use to derive the
coefficients to keep the controller stable?
Post by Richard Cochran
* There is really not much difference between the gettime method and
the sys_offset method. The RMS error between the two is as follows.
Board | RMS
-----------------------------+---------
PHYTER on the m5234bcc | 25571.10
BeagleBone | 1058.74
IGB in a 2 CPU Pentium D | 21.16
Looking at the phyter, it isn't really clear whether the gettime
method is worse or not. Based on this observation, I think we should
keep clock_gettime as the the default (since it always is available)
and offer the sys_offset method as a command line option.
Does the situation change when -N is set to 2 or some small number?
I'd expect the main advantage of sys_offset to be better stability
with small n_samples.

I sometimes find useful to look also at the distribution of the
samples instead of just comparing RMS. The scripts I use for that are
in the attachment.
Post by Richard Cochran
The surprising effect of the load is that is dramatically improves
the estimations. I am not sure why this happens, but one idea is
that the CPU cache actually hurts the clock readings, since the
second system time reading goes faster.
This also explains the better performance that you observed when
running the phc2sys at 100 Hz.
The explanation for the better performance I saw with 100 Hz was
actually a bug in the tsc clocksource with NO_HZ. When the system was
idle, the tsc->time conversion had a large error. Booting with
nohz=off fixed it. I was told this bug was fixed in some 3.7 rc, but
perhaps only on x86_64, I'm not really sure.

This code demonstrates the problem:

http://people.redhat.com/mlichvar/tmp/tstsc.c

Thanks,
--
Miroslav Lichvar
Richard Cochran
2012-11-29 09:48:13 UTC
Permalink
Post by Miroslav Lichvar
Yes, I think that's a good idea. What formula do you use to derive the
coefficients to keep the controller stable?
I just used the values from the original ptpd.
Post by Miroslav Lichvar
Does the situation change when -N is set to 2 or some small number?
I'd expect the main advantage of sys_offset to be better stability
with small n_samples.
I will try that out.
Post by Miroslav Lichvar
I sometimes find useful to look also at the distribution of the
samples instead of just comparing RMS. The scripts I use for that are
in the attachment.
Yes, I will look at that, too.

Thanks,
Richard

Miroslav Lichvar
2012-11-26 14:50:47 UTC
Permalink
Post by Richard Cochran
We use the PTP_SYS_OFFSET ioctl method in preference to doing
clock_gettime from user space, the assumption being that the kernel space
measurement will always be more accurate.
---
makefile | 4 ++--
phc2sys.c | 25 +++++++++++++++++++++++++
2 files changed, 27 insertions(+), 2 deletions(-)
@@ -323,6 +344,10 @@ int main(int argc, char *argv[])
if (device)
return do_pps_loop(device, kp, ki, dst);
+ if (SYSOFF_SUPPORTED == sysoff_probe(CLOCKID_TO_FD(src)))
+ return do_sysoff_loop(src, dst, phc_rate,
+ phc_readings, sync_offset, kp, ki);
+
I think the check should include "dst == CLOCK_REALTIME &&". The
reverse direction could be supported with another loop, but I'm
not sure it's worth it or if it won't be obsoleted soon by extending
the ioctl to support any pair of clocks.

Other than that, both patches look good to me. I haven't done a
real test though.

Thanks,
--
Miroslav Lichvar
Keller, Jacob E
2012-11-26 23:42:07 UTC
Permalink
-----Original Message-----
Sent: Monday, November 26, 2012 4:28 AM
Subject: [Linuxptp-devel] [PATCH RFC 2/2] phc2sys: use the system offset
method if available
We use the PTP_SYS_OFFSET ioctl method in preference to doing
clock_gettime from user space, the assumption being that the kernel space
measurement will always be more accurate.
---
makefile | 4 ++--
phc2sys.c | 25 +++++++++++++++++++++++++
2 files changed, 27 insertions(+), 2 deletions(-)
diff --git a/makefile b/makefile
index 43a58b3..afa5021 100644
--- a/makefile
+++ b/makefile
@@ -31,7 +31,7 @@ PRG = ptp4l pmc phc2sys hwstamp_ctl
OBJ = bmc.o clock.o config.o fsm.o ptp4l.o mave.o msg.o phc.o pi.o port.o \
print.o raw.o servo.o sk.o tlv.o tmtab.o transport.o udp.o udp6.o uds.o util.o
-OBJECTS = $(OBJ) pmc.o phc2sys.o hwstamp_ctl.o
+OBJECTS = $(OBJ) pmc.o phc2sys.o hwstamp_ctl.o sysoff.o
SRC = $(OBJECTS:.o=.c)
DEPEND = $(OBJECTS:.o=.d)
srcdir := $(dir $(lastword $(MAKEFILE_LIST)))
@@ -48,7 +48,7 @@ ptp4l: $(OBJ)
pmc: pmc.o msg.o print.o raw.o sk.o tlv.o transport.o udp.o udp6.o uds.o util.o
-phc2sys: phc2sys.o sk.o print.o
+phc2sys: phc2sys.o sk.o sysoff.o print.o
hwstamp_ctl: hwstamp_ctl.o
diff --git a/phc2sys.c b/phc2sys.c
index a6586a2..19b7803 100644
--- a/phc2sys.c
+++ b/phc2sys.c
@@ -34,6 +34,7 @@
#include "missing.h"
#include "sk.h"
+#include "sysoff.h"
#define KP 0.7
#define KI 0.3
@@ -228,6 +229,26 @@ static int do_pps_loop(char *pps_device, double kp,
double ki, clockid_t dst)
return 0;
}
+static int do_sysoff_loop(clockid_t src, clockid_t dst,
+ int rate, int n_readings, int sync_offset,
+ double kp, double ki)
+{
+ uint64_t ts;
+ int64_t offset;
+ int err = 0, fd = CLOCKID_TO_FD(src);
+ while (1) {
+ usleep(1000000 / rate);
+ if (sysoff_measure(fd, n_readings, &offset, &ts)) {
+ err = -1;
+ break;
+ }
+ offset -= sync_offset * NS_PER_SEC;
+ do_servo(&servo, dst, offset, ts, kp, ki);
+ show_servo(stdout, "sys", offset, ts);
+ }
+ return err;
+}
+
static void usage(char *progname)
{
fprintf(stderr,
@@ -323,6 +344,10 @@ int main(int argc, char *argv[])
if (device)
return do_pps_loop(device, kp, ki, dst);
+ if (SYSOFF_SUPPORTED == sysoff_probe(CLOCKID_TO_FD(src)))
+ return do_sysoff_loop(src, dst, phc_rate,
+ phc_readings, sync_offset, kp, ki);
+
Was going to ask how you checked for if it was supported. Looks good to me then.

I will try to get some testing of this in tomorrow, and see what sort of results it gets.

Thanks

- Jake
while (1) {
usleep(1000000 / phc_rate);
if (!read_phc(src, dst, phc_readings, &phc_offset, &phc_ts)) {
--
1.7.2.5
Loading...