illumos news for May
Flag Day from Dan McDonald
A mail titled "FLAG DAY - 4719 affects nightly, package, and poold" went out to several mailing lists. In it, Dan McDonald announces that #4719 "update gate build environment to [open]jdk7", committed to illumos-gate as 4d0eb50e691de4c20b1dd9976ad6839fede8a42d, makes it necessary to install and use JDK7. As he notes, OpenIndiana 151a9 does not have this version by default. Besides installing [open]JDK7 on OpenIndiana 151a9, you will also need several other steps (quoting Dan's mail):
- "either set JAVA_ROOT to a source of JDK7, or must have /usr/java populated with JDK7"
- "because poold defines JAVA_ROOT in its binaries, you must set JAVA_ROOT when building poold to match the runtime java on your ONU or otherwised-packaged target"
- "IMPORTANT --> If you are an OI 151a9 user, and wish to use poold, installing openjdk7 in instances is not sufficient. You will need to set /usr/java to point to the openjdk7 instance as well. Illumos bug 5851 tracks this"
Right after this mail, Dan sent another one: HEADS UP -- illumos-gate can now be built on OmniOS r151014 or later. Be sure to read the mail as there are important pointers within.
For everybody facing a problem with NFS exhausting worker threads and system memory, there is a cross-post from the OmniOS mailing list by Dan McDonald.
On the heels of this discussion, Marcel sent a mail asking for review of 3783: "Flow control is needed in rpcmod when the NFS server is unable to keep up with the network". As Marcel noted, the fix was reviewed internally at Nexenta about two years ago. It got a pretty enthusiastic answer from Dan McDonald:
This fix has also helped at least on OmniOS user unscrew his previously untenable NFS situation. I reviewed it in Nexenta, and I reiterate now: Ship it! Dan
With this commit from Gordon Ross, the SMB service can now be run from within a non-global zone. The announcement mail from Gordon is under this link.
More from Gordon Ross on SMB:
Gordon created issue 5917: User-mode SMB server. It is a follow-up to work done at Nexenta to make development of the SMB code easier:
Development of the SMB server kernel code can be accelerated significantly by allowing the code to run in user space, where one can use not only the usual dtrace and mdb tools, but also full source-level debugging etc. This work was developed at Nexenta in preparation for major work in the SMB server to add SMB2 support and other modernization. Architecturally, there are just three major parts of this work[...]
You can read the whole issue description for more details.
A big thing came from Dan McDonald:
4719 enables building illumos-gate on OmniOS. The announcement mail is under the link. It is quite a thing for illumos not to be tied to a particular distribution for building. Following the commit, Dan edited the How To Build illumos wiki page to include instructions for using OmniOS as the build OS for illumos.
For all who were looking forward to vanity naming in zones, here comes 5877 from Robert Mustacchi:
As a reminder, vanity naming lets you give network devices arbitrary names.
Robert Mustacchi also prototyped two new mutexes for the userland, basing their behaviour on kernel mutexes.
An excerpt from his idea proposal mail (which you can read for yourself) appears further down, after the dtrace item.
Another proposed change by Robert Mustacchi is increasing the value of IOV_MAX from 16 to 1024 (mail).
It prompted Garrett D'Amore to raise some questions about breaking current applications:
My question here becomes one of concern about breakage & fallout that can occur with this change. Programs & libraries compiled with the old value may become incompatible with those built from the new. Therefore, I see this change as creating a potential flag day. I don’t see any work done here to mitigate this risk. Am I missing something? At a minimum, a transition period, with a message explaining what we are doing and the potential concerns for developers, and perhaps a way to get the old value (-DOLD_IOV_MAX ?) would probably be helpful. (Btw, the kernel bits can always enjoy the new larger value, so I think there isn’t a problem *there*, and I’m happy with the approach not to heap allocate unless a larger number of iovs is really needed.) (Out of curiosity, what programs have you seen that actually *use* more than 16 elements in the S/G array? It feels like 16 *ought* to be plenty for normal sane programs. The most S/G elements I’ve ever seen is 17, and that was for in-kernel DMA because the elements could be broken up on page boundaries, but iovs don’t need to worry about page alignment.
Robert answered that the change actually follows the default on Linux, OS X and FreeBSD, and gave QEMU and tmux as two examples of applications that break within the lx brand with a limit of 16.
I asked Robert for an explanation on IRC and here is the answer:
16:43 <@rmustacc> madwizard: pong
16:47 < danmcd> ping alp and Woodstock
17:12 < madwizard> rmustacc: What does this IOV_MAX do?
17:12 <@rmustacc> Are you familiar with the concept of I/O vectors?
17:13 < madwizard> Not much.
17:14 <@rmustacc> So the traditional unix interfaces take a buffer and a length.
17:15 <@rmustacc> So, the signature of read or write is to take a file descriptor, a pointer to a buffer, and a length of that buffer.
17:16 <@rmustacc> Eventually, folks wanted to be able to read data into more than one buffer or write from more than one buffer using a single system call.
17:16 <@rmustacc> So they introduced the struct iovec.
17:16 <@rmustacc> Which has two members, the pointer to a buffer and a length for that buffer.
17:17 <@rmustacc> With that, new interfaces were introduced called readv and writev.
17:17 <@rmustacc> Which take a file descriptor, a series of struct iovec and a number of these iovecs.
17:18 <@rmustacc> Now, back in the day, they wanted to limit the number of such vectors.
17:19 <@rmustacc> So a macro IOV_MAX described the maximum number of such vectors that the kernel supports.
17:20 <@rmustacc> On most other platforms today that value was 1024.
17:20 <@rmustacc> On illumos, the value has historically been set to 16 because of the System V interface test suite.
17:20 < tsoome> so basically you get like records type file api instead of plain byte stream
17:20 < tsoome> ?
17:20 <@rmustacc> http://src.illumos.org/source/xref/illumos-gate/usr/src/uts/common/syscall/rw.c#611
17:21 <@rmustacc> tsoome: Not quite. You always consume the entirety of one of those buffers before going onto the next.
17:21 < tsoome> ok, i see
17:22 <@rmustacc> Put differently, if you were using readv for say UDP, you couldn't say put a UDP datagram in each of them.
17:22 <@rmustacc> It's really meant for doing scatter/gather I/O.
17:22 <@rmustacc> So for example, I want to read a header into one buffer and a payload into another.
17:22 < tsoome> yea
17:23 < patdk-wk> for say like, database reads, after you already searched the index and know what rows yo uwant
17:23 <@rmustacc> madwizard: I realize that's a bit and might not make the most sense, but let me know what questions you have.
17:23 < madwizard> rmustacc: I wanted to understand significance of this change in illumos
17:24 <@rmustacc> Well, it hasn't landed yet.
17:24 < madwizard> rmustacc: I suppose it has high chance?
17:24 <@rmustacc> I hope so.
17:24 <@rmustacc> But basically all the change itself is doing is raising that limit from 16->1024.
17:25 < tsoome> old limits from time with little memory and slow disks
17:25 <@rmustacc> Which brings what we have in line with other platforms.
17:25 < madwizard> Well, seeing as you mentioned qemu and tmux as apps failing in lx brand with current limit, I presume the change is something we would like to have
17:25 <@rmustacc> Well, the limit is for a bit more specific limit.
17:25 <@rmustacc> Well, QEMU fails natively because at least the version I had didn't check IOV_MAX at all. :/
17:25 < madwizard> rmustacc: Most code today is developed for Linux, I think
17:26 < tsoome> lot
17:26 <@rmustacc> But, raising the kernel limit will be required for lx.
17:26 < madwizard> So it makes sense to at least consider their policies
17:26 <@rmustacc> Regardless of whether or not we want to raise it for user applications.
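To make Robert's explanation a bit more concrete, here is a minimal sketch in C of the scatter/gather pattern he describes. It writes a header and a payload into a pipe with a single writev() call and reads them back into two separate buffers with a single readv() call. The helper names roundtrip() and iov_limit() and the sample strings are made up for illustration; only readv(), writev(), struct iovec, pipe() and sysconf(_SC_IOV_MAX) are standard interfaces.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the system how many iovecs a single
 * readv()/writev() call accepts, instead of assuming a value.
 */
long
iov_limit(void)
{
	return (sysconf(_SC_IOV_MAX));
}

/*
 * Hypothetical helper: gather a header and a payload into a pipe with one
 * writev(), then scatter them back into two caller buffers with one readv().
 * Returns the number of bytes read, or -1 on error.
 */
ssize_t
roundtrip(char *hdr_out, size_t hdr_len, char *body_out, size_t body_len)
{
	char hdr[] = "HDR1";		/* sample 4-byte header */
	char body[] = "payload";	/* sample 7-byte payload */
	struct iovec out[2], in[2];
	int fds[2];
	ssize_t n;

	assert(hdr_out != NULL && body_out != NULL);

	if (pipe(fds) != 0)
		return (-1);

	/* Gather: two separate buffers, one system call. */
	out[0].iov_base = hdr;
	out[0].iov_len = strlen(hdr);
	out[1].iov_base = body;
	out[1].iov_len = strlen(body);
	n = writev(fds[1], out, 2);

	if (n == (ssize_t)(strlen(hdr) + strlen(body))) {
		/* Scatter: the first buffer is filled before the second. */
		in[0].iov_base = hdr_out;
		in[0].iov_len = hdr_len;
		in[1].iov_base = body_out;
		in[1].iov_len = body_len;
		n = readv(fds[0], in, 2);
	} else {
		n = -1;
	}

	(void) close(fds[0]);
	(void) close(fds[1]);
	return (n);
}
```

Called with a 4-byte header buffer and a larger payload buffer, roundtrip() leaves "HDR1" in the first and "payload" in the second. The third argument to readv() and writev() is the count that IOV_MAX caps, so a program that wants to pass more than 16 vectors should check iov_limit() first rather than assume the 1024 found on other platforms.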
Another proposed change from Joyent, relayed to the mailing list by Robert Mustacchi:
5886 and 5887 provide a read-only bootfs filesystem that makes it possible to hand the kernel arbitrary objects at boot time and later on. Read the Redmine ticket (5886 want ability to provide additional objects at boot) for example uses. Sounds very interesting.
From the dtrace discuss mailing list: Nan Xiao wanted to pick up implementing dtrace probes for Python (mail here), but found only old blog entries from John Levon. He was pointed by Chris Ridd (mail) and Thijs Metsch (mail) to Metsch's work on getting dtrace probes upstreamed into Python itself: https://www.jcea.es/artic/python_dtrace.htm
Back to Robert Mustacchi's proposed userland mutexes, here is the excerpt from his proposal mail:
What I've prototyped and have found quite useful and successful in the varpd daemon are adding two new locking routines to the illumos lock interfaces in libc -- mutex_enter() and mutex_exit() that have the same semantics as in the kernel, except they abort a process as opposed to panic the system. They require that the lock be of type LOCK_ERRORCHECK and the interfaces don't attempt to do anything for robust mutexes, where you need to be paying attention to all the error codes anyways.
So far, Garrett D'Amore and Marcel Telka have voiced their support for the idea.
On OpenIndiana Discuss, Apostolos Syropoulos posted an interesting thread about the Java plugin being removed from the Solaris version of Oracle's JDK 8. You can read the thread here.
dmake integration into illumos gate - proposed by Richard Lowe
Argument in favor from his announce mail.
As you are most probably aware, ASLR is a security technique that makes buffer overflow attacks harder to exploit. Working support in illumos would indeed be great.
Alex Wilson and -fstack-protector patch
Issue 5922: "Want support for building with -fstack-protector" is the first in a series of proposed changes to help against stack smashing attacks. I caught it through this tweet. You can read the whole announcement to illumos-developer here. A short introduction from Alex's announce mail:
Short version: this flag tells the compiler to pick a certain subset of functions (explanation in comments) and add a "stack guard" or "canary" -- a magic value placed below the %rbp and return pointer, which is written at function entry and then checked again at function exit. If the value doesn't match, we do not return and instead jump straight into panic(). In this way, when a stack smashing attack is in progress, we avoid giving an attacker control of %rip and instead immediately crash in a safe manner.
As explained in the issue itself, this is a GCC flag that the Linux kernel build process is going to enable by default.
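The canary check from Alex's mail can be imitated by hand to show the idea. The sketch below is only a simulation under stated assumptions: a struct stands in for a stack frame, the guard value is a fixed constant rather than a random one, and the function returns -1 where the real mechanism would refuse to return and panic. The names sketch_frame, sketch_guarded_copy and SKETCH_CANARY are invented for this example; the real work is done by the compiler, not by code like this.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed guard value; a real canary is not a known constant. */
#define	SKETCH_CANARY	0xdeadbeefUL

/* Stand-in for a stack frame: a local buffer followed by the guard. */
struct sketch_frame {
	char buf[8];		/* the local buffer an overflow starts in */
	unsigned long canary;	/* guard value sitting right after it */
};

/*
 * Copy len bytes of src into the simulated frame and report whether the
 * guard survived: 0 if it is intact, -1 if the copy ran over it.
 */
int
sketch_guarded_copy(const void *src, size_t len)
{
	struct sketch_frame f;

	assert(src != NULL);
	f.canary = SKETCH_CANARY;	/* written at "function entry" */

	/* Stay inside the struct so the demo itself is well defined. */
	if (len > sizeof (f))
		len = sizeof (f);
	memcpy(&f, src, len);		/* bytes past buf spill into canary */

	/* Checked at "function exit". */
	return (f.canary == SKETCH_CANARY ? 0 : -1);
}
```

A copy of 8 bytes or fewer fits in buf and leaves the guard alone, while a longer copy overwrites it and is detected before the function would have returned through a corrupted frame.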
Toomas Soome filed an RFC for an alternate boot loader.
Looking for an alternate way to boot illumos, Toomas Soome started checking out the FreeBSD loader. As we know, the FreeBSD loader is capable of booting not only from single-disk and mirrored ZFS pools but also from RAID-Z configurations. As Toomas wrote himself:
Old story, but from another angle. While grub2 is still one possible way to go, it has its cons (it has its own development goals which may or may not fit illumos needs, and possible licensing issues), so I have boxed it for time being and checked out about freebsd loader. While work on loader is nowhere near being done, I have managed to boot both 64 and 32 bit illumos with it (vmware environment), and got few ideas about integration. The current code can boot just from GPT labeled disk (pmbr + gptzfsboot bootblocks), adding up MBR+SMI is easy to add and just question of time.
Seems like something to check out. It definitely got noticed by others and got quite an enthusiastic welcome from Garrett D'Amore:
Just to inject a few pennies here — this work is absolutely *awesome*. I’ve been wanting to look at doing this myself for *years* now. I’m so glad that you’re doing it Toomas. I think also the fbsd folks have figured out things like proper ZFS support in their boot loader, which is probably a lot cleaner/better than what its in Grub, even without the licensing “issues”.
New issues in May: 72, of which 26 are already closed.
Issues closed in May: 58