Ever run into this one? Under Leopard (OSX 10.5.8), the above message is posted to the system console every time cron launches a process. If you use cron for anything and the console for anything, this can be very, very annoying. Apple’s known about the problem for years, and has done nothing. Since I fall into that class of folks who use both the console and cron, I’ve been more or less quietly steaming about it. I finally decided to do something about it. Be warned; the following is technical and to-the-metal hackery. Don’t try this at home unless you’re very confident in your skill(z).
Now, as this hack is fairly specific to the problem, I don’t think it can hurt anything else (if performed correctly), and it serves my needs perfectly. Your mileage may vary for any number of reasons, and if you’re not a technical person, you should stop reading now and just forget you ever saw this post. Really. Stop now.
No? Still here? Ok, then…
Before you start, make a copy of the unmolested version of launchd, or you may be very, very sorry. I would also recommend that this only be undertaken if you have a second Mac around that you can use to get at the HD of your machine via FireWire if you foul this up and it fails to reboot — because if launchd won’t run, neither will much of anything else. You’ve been warned: this has to be done exactly right, or you may find yourself bringing your machine back up from the stone age, and cursing everything in sight.
Still here? LOL. Ok then, you asked for it:
Searching for the error message, or fragments of it, through the 10.5.8 OSX source code at http://www.opensource.apple.com/ eventually turns up launchd_core_logic.c:
    kern_return_t
    job_mig_post_fork_ping(job_t j, task_t child_task)
    {
        struct machservice *ms;

        if (!launchd_assumes(j != NULL)) {
            return BOOTSTRAP_NO_MEMORY;
        }

        job_log(j, LOG_DEBUG, "Post fork ping.");

        job_setup_exception_port(j, child_task);

        SLIST_FOREACH(ms, &special_ports, special_port_sle) {
            if (j->per_user && (ms->special_port_num != TASK_ACCESS_PORT)) {
                /* The TASK_ACCESS_PORT funny business is to workaround 5325399. */
                continue;
            }

            errno = task_set_special_port(child_task, ms->special_port_num, ms->port);

            if (errno) {
                int desired_log_level = LOG_ERR;

                if (j->anonymous) {
                    /* 5338127 */
                    desired_log_level = LOG_WARNING;

                    if (ms->special_port_num == TASK_SEATBELT_PORT) {
                        desired_log_level = LOG_DEBUG;
                    }
                }

                job_log(j, desired_log_level, "Could not setup Mach task special port %u: %s", ms->special_port_num, mach_error_string(errno));
            }
        }

        job_assumes(j, launchd_mport_deallocate(child_task) == KERN_SUCCESS);

        return 0;
    }
So basically, as the chunk of launchd source above shows, there’s one job_log() call that reports this particular error. And pretty much just this particular error.
It takes five arguments, one of which is returned from a call to mach_error_string(). Apart from the job handle j itself, the only arguments that are pointers (carrying the implication that the called routine could somehow get back at the original) are the string pointers, and self-modification of a static format string… nah. That’s not the kind of thing the serious Koolaid drinkers at Apple would ever do. So it is clear that the called procedure isn’t doing anything to those arguments; they are just used to report the error if it occurs, because if the error doesn’t occur, the call isn’t made. So, we don’t need to make the call for this routine to function correctly. Ah-ha.
This procedure call to mach_error_string() turns out to be a unique signature in the 10.5.8 version of launchd: it only occurs once in the source code. That in turn allowed me to precisely locate the exact binary machine code within the launchd executable on my machine using a disassembler.
Once there, it was pretty obvious what was going on: there were the appropriate number of move instructions for the number of parameters for the call, plus a little bit of indirect parameter retrieval, then the call, then the loop condition is checked and the loop is either exited or re-run. Here’s the relevant portion of the disassembly:
    +193  0000fcd9  e87e050100        calll  0x0002025c  _mach_error_string
    +198  0000fcde  89442410          movl   %eax,0x10(%esp)
    +202  0000fce2  8b462c            movl   0x2c(%esi),%eax
    +205  0000fce5  c7442408b8b90100  movl   $0x0001b9b8,0x08(%esp)
    +213  0000fced  895c2404          movl   %ebx,0x04(%esp)
    +217  0000fcf1  893c24            movl   %edi,(%esp)
    +220  0000fcf4  c1e808            shrl   $0x08,%eax
    +223  0000fcf7  25ff030000        andl   $0x000003ff,%eax
    +228  0000fcfc  8944240c          movl   %eax,0x0c(%esp)
    +232  0000fd00  e857afffff        calll  0x0000ac5c
The binary signature of the call (again, I emphasize, in the 10.5.8 version of launchd on my machine) is: E857AFFFFF. There’s only one instance of this binary string in the entire file, so again, an easy marker.
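That "only one instance" check is worth repeating on your own copy of the binary before touching anything. Here is a minimal sketch in C of how it could be done; count_pattern is a hypothetical helper name (nothing from launchd), and it assumes you have already read a copy of the file into a buffer.

```c
#include <stddef.h>
#include <string.h>

/* Count how many times the byte pattern `pat` (length `pat_len`)
   occurs in `buf` (length `buf_len`). Overlapping matches count.
   An empty pattern is treated as "no matches". */
size_t count_pattern(const unsigned char *buf, size_t buf_len,
                     const unsigned char *pat, size_t pat_len)
{
    size_t count = 0;

    if (pat_len == 0 || buf_len < pat_len)
        return 0;

    for (size_t i = 0; i + pat_len <= buf_len; i++) {
        if (memcmp(buf + i, pat, pat_len) == 0)
            count++;
    }
    return count;
}
```

Called with the five bytes E8 57 AF FF FF as the pattern, anything other than a result of 1 means the marker is missing or ambiguous in that build, and the rest of this recipe should not be applied to it.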
What we need to do here is replace that with something harmless (in context) of the same length. Nearby, there’s an immediate AND instruction for eax; that’ll do. The value in eax isn’t used again, as the loop ends after the call. The binary for that AND instruction is: 25FF030000. Even if, worst case, at load time, the code were relinked so that (what was) the calll address was changed, all that would happen is the AND instruction would have a different AND pattern. So no matter what, this should be ok to do.
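The reason the two instructions swap so cleanly is that both are one opcode byte followed by four operand bytes: E8 is the x86 near call with a 32-bit little-endian displacement measured from the next instruction, and 25 is AND on eax with a 32-bit immediate. As a sanity check on the disassembly, here is a small sketch in C that decodes a call’s displacement back into its target address; call_target is a hypothetical helper name, used only for illustration.

```c
#include <stdint.h>

/* Decode the target of an x86 near call (opcode E8 followed by a
   32-bit little-endian displacement). `addr` is the address of the
   E8 byte itself; `insn` points at the five instruction bytes. */
uint32_t call_target(uint32_t addr, const unsigned char insn[5])
{
    uint32_t rel = (uint32_t)insn[1]
                 | ((uint32_t)insn[2] << 8)
                 | ((uint32_t)insn[3] << 16)
                 | ((uint32_t)insn[4] << 24);

    /* The displacement is relative to the next instruction (addr + 5).
       Unsigned wrap-around takes care of negative displacements. */
    return addr + 5u + rel;
}
```

Feeding it the bytes E8 57 AF FF FF at address 0xfd00 gives 0xac5c, and E8 7E 05 01 00 at 0xfcd9 gives 0x2025c, matching the two calll targets in the listing above.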
So we replace the E857AFFFFF with 25FF030000, and now what the code does is load up those parameters, AND the eax register to no point at all, and go on with life. No more “Could not setup Mach task special port %u: %s” messages. Ever. And launchd will continue to work just like it always did, because all that has been done here is to excise a call to log an error message.
Now… just a couple closing remarks.
First, although the hack is quite specific, it isn’t quite what I’d call surgically precise; it is possible that there might be lurking somewhere a situation that would legitimately call for this message, or a message emitted using this format string and parameters with a different port number and/or string at the end; and that’s not going to be logged if it happens. So be aware of that. Error handling, such as it is, won’t change, but the logging… that’s now impossible.
Second, this is hacking in the classic sense; for me, it was both fun and very satisfying, serving the purpose of raising my central digit in Apple’s general direction for inconveniencing me with something they could have easily fixed themselves in a matter of seconds; for you, though, if you’re not comfortable with machine code and binary, and/or not very certain you can do this exactly right… and you’re not completely prepared to have to FireWire into your machine and replace the hacked launchd with the original again… or you don’t have 10.5.8… don’t even try it. Just write Apple and tell ’em to fix their broken launchd.
So… Apple gives us a bug; drops the ball on fixing it; I entirely lost patience and hacked it out of my face. And there you have it. Pbbffft.