MPC Status Page
This page describes enhancements to, and problems that have occurred with, the MPC's webpages and scripts, and the fixes that have been made. Resolved problems will remain in the Outstanding Problems sections for a brief period after resolution before being moved. This page will also try to give advance notice of the unavailability of these services due to network maintenance.
Known problems with contacting certain e-mail addresses are listed elsewhere.
Unless otherwise indicated, all times on this page are UTC times. Prior to 2010 Nov. 7, the times shown are local times (either Eastern Standard Time [UTC-5h] or Eastern Daylight Time [UTC-4h], as appropriate).
Status Messages and Outstanding Problems
- June Minor Planet Circulars
2016 June 21: 10:59. The date on the June 20 batch of Minor Planet Circulars is incorrectly stated as June 5 (the date of the last mid-month MPS batch). This is a cosmetic issue, only.
- Planned Power Outage June 4 UT CANCELLED!
2016 June 2: 23:10: We have just been informed that there will be a power outage at CDP, where our computers currently reside, which will last for several hours from 00:00 UTC June 4. Before this time, we will force the website onto the backup site, but processing will be shut down from about 19:30 UTC on June 3. Processing will resume as soon as possible after power is restored.
- June 3: 02:55. We are exploring the possibility of not having to shut down our current pipeline during this power outage (and running off battery backup). Since the networking components outside our intranet will be powered off, no processing of new batches will be possible, but it will mean that processing can restart as soon as power is restored to the complex.
- June 3: 16:32. The previously-mentioned plan can't be done as the battery backup has to be disabled during this electrical work (!). We will have to shut down AUTOACK at 22:55 UTC. The website will be failed over to the backup site around 23:00 UTC. The electrical work is expected to take 2-3 hours. Normal processing will be resumed as soon as possible after power is restored.
- June 3: 17:34. We have just been informed that this planned outage has been cancelled!
- MPES Problem
2016 May 23: 13:40: It seems the index and the data file for the unnumbered orbits in the MPES have got out of sync. They are being regenerated.
- 15:30: The regeneration of the necessary files has completed.
- Network Outage at CDP
2016 April 2: 13:18: Around 02:00 all network connections into and out of CDP failed. This prevented our processing pipeline from receiving and processing observations and knocked the website offline. For reasons that are not yet clear, the website did not fail over as intended to the mirror site. This visibility problem was alleviated by setting up "round robin" IP addresses to the various minorplanetcenter.net host names, which quickly restored access to our services via the mirror site.
It turns out the problem was caused by the failure of a firewall border router. This faulty unit was replaced around 13:00 and normal network connectivity resumed shortly thereafter. The backlog of submitted batches is being cleared.
- Some Incorrect Discovery Details for New Numberings
2016 March 24. Due to a production error, the discovery details for a number of the new numberings in the March 23 batch of MPCs are incorrect. The details will be corrected, hopefully, in the next batch of MPCs.
- No DOU MPEC Issued February 24
2016 February 24: 14:30. No DOU MPEC will be issued today, due to delays in filing the material from the Feb. 22 MPCs.
- Planned Interruption of MPC Network
2016 February 21: 14:30. We have been informed that the Computation Facility is planning patch installations on the firewall routes used at the CfA's three locations. The work is planned between 18:00 and 19:00 UTC today. We will lose network connectivity between machines outside and inside the CfA network, and the MPC website will be unavailable to outside users.
- Failover of cgi.minorplanetcenter.net
2016 February 18: 01:00. The script server cgi.minorplanetcenter.net has failed over to the mirror site. The web server (which currently shares the same IP address) has not. We have hardcoded the IP address of the main script server into the NEOCP tabular webpage, until such time as the reason for this failover is understood and the failover reversed.
- February 18: 03:20. We have reported the failback failure to the domain hosting company. We are awaiting a response. In the meantime, we have set up an rsync to copy the NEOCP files every minute from the main site to the mirror site.
- Sky Coverage Problems
2016 January 24: 23:00. A user pointed out that the Sky Coverage service was failing with "Internal Server Error". We have been investigating but cannot locate the problem. We noticed that a user (possibly in the Orem, UT region of the US) was using curl once a day (presumably via a cron job or similar) to get a sky coverage plot and that was working. We can confirm that direct curl calls to the script are continuing to work, but we do not know why that should be the case when the web interface is failing. We continue to investigate.
- January 26: 10:43. The Sky Coverage suddenly started working again. We have no idea why it started working again. If you find it not working again, please alert us immediately.
- No Minor Planet Circulars in January
2016 January 21: 22:10. There will be no batch of Minor Planet Circulars issued in January, due to concerns about a possible snow storm and the after-effects of a water main break at the observatory that may require interruptions in the internal network in the coming days.
- Incomplete File of Complete Numbered Minor Planet Observations
2015 December 28: 21:30. It seems that the complete numbered minor-planet observation file is missing objects with numbers above 430000. The file is being regenerated.
- 22:55. The file has been regenerated and has been copied to the webserver.
- Changes in Some URLs
2015 November 5: 12:40. Over the next day or so, a couple of URLs will be changed. Automatic redirects will be put in place so that existing links on remote sites should be unaffected. Currently, the pages that will have new locations are the Distant Artificial Satellites Observation page and the list of PHAs.
- Problems Accessing the MPC Site
2015 October 25: 14:00. A number of users (initially, all in the U.S., but now around the world) have reported that they cannot access the MPC website. A number of users have supplied traceroute output, all of which seems to show that the problem is a router/switch in the Harvard network. We have escalated this to the Computation Facility.
- File of Isolated Tracklets Location Moved
2015 October 22: 17:00. The file of unlinked/unverified observations, known as Isolated Tracklets, has moved location and changed name. The file is now known as itf.txt. A link to the zipped version is available, as before, on the MPCAT-OBS page. If you have scripts that download this file automatically, you will need to modify them.
- Duplicate Entries in MPCORB.DAT file
2015 October 12: 17:39. The problem of duplicate entries appearing in the MPCORB.DAT file appears to have been fixed. The problem was due to the fact that the default behavior for the Linux sort command is a case-insensitive sort. The script that rebuilds the MPCORB.DAT at the start of the MPC month was lacking the command-line incantation necessary to make the sort case sensitive. The nightly rebuild script had the necessary line. It has been added to the monthly build script and MPCORB.DAT has been rebuilt. Let us know if problems remain.
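As an illustration of how sort order can let duplicate entries slip through, the following Python sketch sorts a few keys both ways and then drops adjacent repeats, as a uniq-style step would. It is not the MPC's build script: the keys are made-up stand-ins for designations that differ only in letter case, and the idea that duplicates are removed by comparing neighbouring lines is an assumption made for this example.

```python
def drop_adjacent_duplicates(sorted_keys):
    """Keep a key only if it differs from the one kept just before it."""
    kept = []
    for key in sorted_keys:
        if not kept or kept[-1] != key:
            kept.append(key)
    return kept


# Hypothetical keys; two are identical, one differs only in letter case.
keys = ["K15T00a", "K15T00A", "K15T00a"]

# Case-sensitive ordering: identical keys become neighbours.
case_sensitive = drop_adjacent_duplicates(sorted(keys))

# Case-insensitive ordering: all three keys compare equal, so the stable
# sort leaves them in input order and the identical keys stay separated.
case_insensitive = drop_adjacent_duplicates(sorted(keys, key=str.lower))

print(case_sensitive)
print(case_insensitive)
```

With the case-sensitive ordering the two identical keys end up next to each other and one is dropped. With the case-insensitive ordering they can remain separated by the differently-cased key, so both survive.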
- AUTOACK Paused
2015 October 1: 20:27. We have paused the AUTOACK while we investigate a severe memory leak on our Linux compute cluster. The head node of our Rocks cluster (24 GB RAM) keeps running out of memory for no apparent reason. Until AUTOACK is restarted, no ACKs will be sent out.
- October 2: 01:00. The memory problem apparently cleared itself, so we restarted AUTOACK, but the problem quickly resurfaced. We will have to examine this in the office tomorrow, so AUTOACK and various internal processes will have to be stalled until then. There will therefore be no DOU MPEC issued on Friday, October 2. This problem is also delaying the final preparation tasks and issuance of the Sept. 28 MPCs.
- 10:40. On our way into the office this morning, we will stop by CDP and shift compute nodes to a new head node. In the meantime, AUTOACK has been restarted, but will need to be shut down when the nodes are transferred.
- October 3: 13:00. We have identified why the problem has been occurring and have fixed it. The problem was related to a shell invoked by a single user. What we don't understand is why this problem suddenly appeared a few days ago, as no patches have been installed recently. The MPC post-filing tasks can now continue and the Sept. 28 MPCs will appear as soon as possible.
- DB Search by Obscode Disabled
2015 September 20: 20:00. The DB search feature allowing extraction of observations within certain dates, optionally by a specific observatory code, has been disabled. There have been many instances of users requesting all observations made by the most prolific observatory codes, sometimes two identical requests at the same time. This attempt to download millions of observations, ordered by date, is grossly inefficient and clogs up our server, causing our web service to fail over to the mirror site since the IP monitor does not receive a timely response to its requests.
- Complete Observation Files
2015 September 2: 10:15. The complete observation files appear to be missing from the MPCAT-OBS service. We are investigating.
- The .gz files appear not to have been copied from the machine where they are prepared to the webserver. They are being copied now.
- MPEC 2015-P11
2015 August 8: 20:57. Due to a problem with access to a shared resource during the production of a recovery MPEC, the process stalled and caused the following DOU MPEC to have the same number. The simplest method to fix this is to retain the DOU MPEC as 2015-P11 and rename the recovery MPEC as 2015-P15. These changes have been made.
- Temporary Disabling of AUTOACK
2015 July 27: 15:12. We have temporarily disabled AUTOACK to investigate a problem with one of our machines in the Rocks cluster. We restarted AUTOACK a short time later.
- MPChecker Problem
2015 July 16: 23:00. Starting around 10:00 UTC today, calls to the MPChecker script have been failing with the dreaded "Internal Server Error" message. We are investigating.
- July 17: 00:13. We believe we have located the problem and put a fix in place. It is worth remarking that ten hours elapsed from the problem's first appearance to the receipt of the first e-mail notifying us that there was a problem.
- Rejected Name Proposals
2015 July 6: 10:50. A large number of rejection notices for new name proposals have been sent out. A number of rejected proposals were submitted before the introduction of the web interface, so the system does not have the proposal associated with the e-mail address of the proposer. Since they will not receive the rejection message, the list of objects with rejected proposals is given here: (12582); (29449); (168234); (186499); (278141); (341413); and (365758).
- Missing Editorial Notice on July MPCs
2015 July 5: 14:50. We have just noticed that the Editorial Notice on MPC 94395 is missing. The missing text is given here: "There will be no batch of these circulars issued at the late July Full Moon, due to staff attendance at the IAU General Assembly. Mid-month batches will continue to be issued as normal."
- Harvard Mail Server Blocking MPECs
2015 May 21: 02:00. We have determined that the e-mailing of recent MPECs is being blocked by the Harvard mail hub, which has decided that we are spamming! This is not the first time they have blocked us. We have put in a request to have the block removed.
- 14:30. The block has been removed. We have remailed circulars K26 through K34.
- No Minor Planet Circulars in May
2015 April 10: 13:26. There will be no batch of Minor Planet Circulars issued in May due to the NASA Review of the MPC.
- CfA Network Outage Scheduled for April 5
2015 March 27: 20:08. We have been informed that the Computation Facility will be upgrading the network firewall at CDP on April 5, between 17:00 UTC and 21:00 UTC. We will have no connection to the outside world during this upgrade. Web services at CDP will not be available, but the mirror site at Harvard EPS should be functional. We aim to have automatic failover for the website active before April 5, so the use of the mirror should be transparent to users (as long as you refer to the site via the hostname...).
- March 30: 17:10. We are temporarily disabling the AutoAck procedure in order to reconfigure some machines to ease the upcoming network outage.
- 18:47. AutoAck has been restarted.
- April 5: 01:15. Automatic failover for minorplanetcenter.net should be functional. We will pause automated processes starting around 16:30 UTC.
- Repeated MPEC 2015-F31
2015 March 19: 22:45. It seems that two MPECs were issued with the same circular number. This shouldn't happen, and it is unclear at the moment why it did happen. MPEC 2015-F31 is a new comet discovery and -F32 is a new NEO discovery. The versions on the web have been fixed.
- CfA Network Outage Scheduled for March 23-24
2015 March 10: 17:00. We have been informed that the Computation Facility will be upgrading the CfA network firewall on March 23-24, between 17:00 UTC and 01:00 UTC. We do not yet know if this means that the MPC webservers will be unavailable. We do know that we will have no access into our machines from outside, neither will we be able to update the website.
- Temporary Disabling of AUTOACK
2015 March 7: 00:45. We have had to temporarily disable AUTOACK to fix a problem with the checking routines that are run on batches of unidentified observations. We restarted AUTOACK some time later.
- Restrictions on Downloading Files
2015 February 25: 17:00. There are many users who download at intervals of minutes or seconds files that are changed only daily (or less frequently). The sheer number of such users is causing significant (and unnecessary) load on our servers. We have therefore implemented a scheme by which only one download of each file per 12 hours is permitted per IP address. Subsequent attempts to download the same file within 12 hours will return a blank file (unless the file in question has been updated since the last attempt). Due to the way that the scheme works, an aborted download will trigger the 12-hour block.
We have temporarily disabled this feature. It will return in a somewhat modified form in the near future.
- How do I tell if I'm downloading a file subject to restrictions? When you click on a link to get a file, if the URL is redirected to /download/transfer, then that file is subject to download restrictions described above.
- Although we will over time expand the number of files subject to download restriction, we do not plan to do this for the NEOCP files.
- If you believe that you are being blocked incorrectly, send a message to mpc@cfa stating the name of the file you are trying to access and the IP address of your machine (be sure that the IP address you give is an externally-visible address).
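The download rule described above can be sketched in a few lines of Python. This is only a reading of the stated policy, not the MPC's implementation: the class and method names are hypothetical, the 12-hour window is assumed to run from the last permitted download, and the attempt is recorded when the request starts (which is one way an aborted download would still trigger the block).

```python
from datetime import datetime, timedelta

BLOCK_WINDOW = timedelta(hours=12)


class DownloadGate:
    """Hypothetical per-IP, per-file download limiter."""

    def __init__(self):
        # (ip, filename) -> time of the last permitted download
        self._last_permitted = {}

    def allow(self, ip, filename, now, file_modified):
        """Return True to serve the file, False to serve a blank file."""
        key = (ip, filename)
        last = self._last_permitted.get(key)
        if last is not None:
            within_window = now - last < BLOCK_WINDOW
            unchanged = file_modified <= last
            if within_window and unchanged:
                return False
        # Recorded at the start of the request, so an aborted transfer
        # still counts as a download.
        self._last_permitted[key] = now
        return True


gate = DownloadGate()
start = datetime(2015, 2, 25, 17, 0)
built = datetime(2015, 2, 25, 0, 0)  # assumed last update of the file
print(gate.allow("192.0.2.1", "MPCORB.DAT", start, built))
print(gate.allow("192.0.2.1", "MPCORB.DAT", start + timedelta(minutes=5), built))
```

The first request is served; the repeat five minutes later from the same address gets the blank file because the file has not changed in the meantime.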
- Possible AUTOACK Delays
2015 February 9: 18:00. Some 24 very large observation batches are currently working their way through the AUTOACK pipeline and this is delaying the sending of acknowledgements. The backlog will be cleared after these very large batches finish processing.
- ECS Problem
2015 January 7: 18:31. A user has pointed out that the files associated with the Jan. 7 batch of Minor Planet Circulars are actually last month's files. An investigation has shown that the copying of those files to the last month directory worked, but then the copying of the new files failed because the files from last month were still present in the current month directory. This procedure worked last month (and before), so it is unclear why the behavior changed when the copying procedure was not altered.
- 19:35. We believe the problem with the MPCUPDATE files to be fixed. Please report any anomalies to mpc@cfa in the normal fashion. We do not yet understand why the process failed, other than suspecting that the syncing with the mirror site during the December problems messed things up. We will keep an eye on the process next month.
- Incorrect Discovery Details in the January Minor Planet Circulars
2015 January 7: 15:00. There are an (as yet unknown) number of incorrect discovery details for numbered minor planets, that should be covered by the MPEC 2010-U20, in the January batch of Minor Planet Circulars. This error is most perplexing as the bulk of the new numberings were entered in one operation on Dec. 27 and there are examples of both correctly-assigned and incorrectly-assigned details in objects inserted at that time. Either they should all be correct, or they should all be incorrect--hence our perplexion. Corrected details will appear in the Feb. Minor Planet Circulars. No name proposals will be accepted for objects with numbers between (415689) and (422636), inclusive, until this is resolved. Any such proposals that appear on the CSBN website will simply be deleted.
- 15:47. We now know why the misassignment occurred. The previous version of the file of designations that were grandfathered into the old discovery assignment scheme was dated 2011 June 4. The latest version was dated 2014 July 11 and contains 207 designations that are later than the changeover date. The date suggests that the change occurred around the time of the preparation of a batch of MPCs, but it is difficult to see how the erroneous additions could have been added as the file is only read by the numbering procedure, never written. More problematic is that some grandfathered objects may have been subjected to the new rules. We are checking into that. A check is being added to the numbering process to ensure that the creation date of the latest version of the file is not after "14-JUN-2011 14:45:02.88".
- January 28: 16:30. Unfortunately, the corrected details will not be ready for the Feb. Minor Planet Circulars. We will endeavor to get them sorted out for the March circulars.
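A guard of the kind mentioned in the 15:47 update could look roughly like the Python sketch below. It is an assumption-laden illustration rather than the actual numbering code: the function names are hypothetical, the creation date is assumed to be available as text in the same format as the quoted cutoff, and the times of day attached to the two file versions in the usage lines are invented (the source gives only their dates).

```python
from datetime import date, datetime

CUTOFF_TEXT = "14-JUN-2011 14:45:02.88"  # cutoff quoted in the status entry

MONTHS = {
    "JAN": 1, "FEB": 2, "MAR": 3, "APR": 4, "MAY": 5, "JUN": 6,
    "JUL": 7, "AUG": 8, "SEP": 9, "OCT": 10, "NOV": 11, "DEC": 12,
}


def parse_timestamp(text):
    """Parse text such as '14-JUN-2011 14:45:02.88' into a datetime."""
    date_part, time_part = text.split()
    day, month, year = date_part.split("-")
    clock = datetime.strptime(time_part, "%H:%M:%S.%f").time()
    return datetime.combine(date(int(year), MONTHS[month.upper()], int(day)), clock)


def file_version_is_acceptable(created_text, cutoff_text=CUTOFF_TEXT):
    """True if the file's creation date is not after the cutoff."""
    return parse_timestamp(created_text) <= parse_timestamp(cutoff_text)


# Dates from the status entry; the times of day are placeholders.
print(file_version_is_acceptable("04-JUN-2011 00:00:00.00"))
print(file_version_is_acceptable("11-JUL-2014 00:00:00.00"))
```

Under these assumptions the 2011 June 4 version passes the check and the 2014 July 11 version is rejected, which is the behaviour the added check is meant to enforce.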
- No Minor Planet Circulars in December
2014 December 6: 20:00. There will be no batch of Minor Planet Circulars issued in December.
- Minor Change in NEOCP Posting Criterion
2014 December 5: 17:45. The digest2 score necessary for a new potential NEO candidate to be posted to the NEOCP has been raised from 50 to 65.
- Problems with CF DMZ
2014 November 29: 22:00. At some point this afternoon, the website went down. After some investigation, we believe the problem to be not with our webserver machine, but rather with the Computation Facility's DMZ in which the webserver resides. We can access the webserver from other machines within the DMZ, but not from any machine outside.
- We have switched the mpc.cfa.harvard.edu address over to the mirror site. The website will remain on the mirror site until the DMZ problem is fixed. This change should be transparent to outside users.
- The CSBN name submission and member voting pages will have to be disabled until the web site is back on the regular machine.
- The cgi scripts are not functioning on the mirror machine. We are working on fixing this.
- November 30: 02:25. The NEOCP, MPChecker and MPEph should now be functional, with the following caveats: 1) graphical display of NEOCP uncertainties is not functioning (a blank page is returned, we are working on this issue); 2) uncertainty information and residual blocks in the MPEph are not available (the necessary data files do not exist on the mirror site and will not be available until the DMZ problem is fixed). The links to the designations associated with each object and the link to the naming citation are now functional. Other cgi scripts may or may not be functional. Scripts that call scully.cfa.harvard.edu will fail until such time as the CF repoints that host name to the mirror site (IP 18.104.22.168, in case you wish to hardcode an IP address into your scripts).
- Our Linux processing cluster also experienced issues today. We have rebooted the cluster. The issues caused a number of known objects to be posted to the NEOCP--these are being removed.
- A major issue with shifting the cgi scripts over to the mirror site has been the fact that the MPC does not control the scully.cfa.harvard.edu hostname. We therefore intend to migrate in the short-term to using cgi.minorplanetcenter.net as the hostname for scripts, since we control this hostname. Current scully.cfa.harvard.edu addresses will continue to work for the foreseeable future, but will not be altered if we need to switch to the mirror site again. Writers of scripts are advised to use cgi.minorplanetcenter.net when writing new scripts and to edit existing scripts.
- The processing cluster has been rebooted to clear problems that have caused extremely sluggish responses. The NEWNEO and NEOCP are being restarted, and the backlog of these types of objects will then be cleared automatically. Once these are restarted, AUTOACK will be restarted.
- December 1 @ 06:30. Processing cluster is again sluggish. We believe we have identified the problem: a failing SCSI interface on one of the machines hosting the shadow sets. We are heading back in to CDP to replace the card.
- 07:25. We believe the processing cluster problems to be fixed. NEONEW and NEOCP processing will be back to normal, once the backlog is cleared. AUTOACK is functional. Processing of non-urgent observations (e.g., MBAs) will be restarted tomorrow (erm, later today...).
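For script writers following the advice above to move from scully.cfa.harvard.edu to cgi.minorplanetcenter.net, a small helper along these lines may be enough to update stored URLs. It is a sketch only: the function name is hypothetical, the example path is a placeholder rather than a real MPC script, and any user-name or password embedded in the URL is not preserved.

```python
from urllib.parse import urlsplit, urlunsplit

OLD_HOST = "scully.cfa.harvard.edu"
NEW_HOST = "cgi.minorplanetcenter.net"


def migrate_script_url(url):
    """Swap the old script hostname for the new one; leave other URLs alone."""
    parts = urlsplit(url)
    if parts.hostname != OLD_HOST:
        return url
    # Keep an explicit port if the original URL carried one.
    netloc = NEW_HOST if parts.port is None else f"{NEW_HOST}:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))


# Placeholder path, used only to show the substitution.
print(migrate_script_url("http://scully.cfa.harvard.edu/cgi-bin/example.cgi?x=1"))
```

Referring to the service by hostname rather than by a hardcoded IP address means such scripts should keep working if the site is switched to the mirror again.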
- Unexpected Machine Reboot Overnight
2014 November 25: 21:00. Early this morning, one of our cluster machines rebooted unexpectedly. Upon reboot, other machines in the cluster failed to remount one of the hosted shadow set disks. This stalled processing. A dirty cluster reboot had to be performed. NEO-related processes were brought back first. Some less urgent processes remain to be restarted. If you think a batch you submitted today has not been processed, please contact us. You should only resubmit if we ask you to.
- Suspension of ACKs
2014 November 19: 18:55. The AUTOACK process has been suspended while we investigate an odd problem with the compute cluster. E-mails sent to obs@cfa will be ACKed once we figure out the problem.
- 21:01. AUTOACK has been restarted.
- Suspension of ACKs
2014 October 30: 16:00. The AUTOACK process has been suspended, so that we can set up master-slave replication of our DBs with a remote system. We anticipate that this will take about two hours. All web services should continue to operate normally during this suspension.
- 17:18. DB setup has completed. AUTOACK has been restarted.
- Missing Information in MPES
2014 October 17: 01:39. It turns out that a large-capacity data disk on the webserver failed to mount following the recent power outage. Among other things, the disk contains the residual blocks and the datafiles necessary to generate uncertainty information. We will try and get it mounted as soon as possible.
- 14:04. The disk is now mounted. Residual blocks and uncertainty information are again available.
- Power Outage
2014 October 15: 05:27. There was an unplanned power outage in the computer room at CDP on October 14 @ 20:27. All MPC machines were affected. This is a repeat of the problem that caused the March 8 power outage. Power was restored shortly after 24:00. All automated processes have been restarted and the web site appears to be functional. One serious problem remains: no observational e-mail is being delivered to the processing queues. A query has been sent to the Computation Facility to track down why e-mail is not being delivered. Fortunately, e-mail is also delivered to a staff e-mail address, so the observations that have come in since the power outage are being submitted via the cURL route. This is a manual process. Observers are encouraged to use the cURL submission method, at least until the e-mail issue is resolved.
- October 15: 07:00. The power failed again in the CDP computer room!
- 10:00. Power was restored. Systems were restarted and the processing pipelines have been (mostly) restarted. It is possible that some observation batches may not have been submitted to the processing queues. If you are lacking an ACK for a batch, you have not received a status message for a new NEO candidate, or a NEO candidate that should have been posted wasn't, please contact us (making sure to mention which batch and designation you are referring to). Do not resend unless we ask you to do so.
- MPEC mailing problem
2014 September 22: 23:40. It appears that the mail server we use has decided to flag our MPEC mailings as "suspicious" and is not sending them out. We will try and get this block removed.
- September 23: 01:35. We have received reports that NEOCP Blog postings are not being sent to subscribers who have requested e-mail delivery. Posts are appearing on the NEOCP Blog itself.
- 02:25. While we were investigating, the problem with the mail server disappeared and stacked-up e-mail began to be delivered, without us actually changing anything.
- Two MPEC 2014-Q39
2014 August 24: 19:10. Two MPECs, for different objects, were issued with the same MPEC number. The correct 2014-Q39 is the one listing the recovery of P/2005 Q4 (LINEAR), while the recovery MPEC for 2011 ON45 is now 2014-Q40. The web site is being updated with these corrections.
- NEOCP Updating Problem
2014 June 25: 00:30. A configuration problem has been causing updates to the NEOCP to fail for the past few hours. The fix for this is being worked on actively.
- 06:07. The problems have been fixed and the backlog of new NEO and NEOCP batches has been resubmitted to the processing queues. We believe that all batches have now been processed. Please report any missing NEO/NEOCP observations.
- PDF of 2014 June 13 MPCs Replaced
2014 June 17: 12:30. The PDF of the 2014 June 13 MPCs has been replaced, following an e-mail report that the new numbers were missing from one table in the circulars. There was an issue during the preparation of this batch that caused the numbers to be omitted from certain tables/files. The datafiles in the MPCUPDATE service were fixed earlier.
- June 24: 18:00. The file was again replaced to fix a minor issue.
- Known Objects Posting on NEOCP
2014 June 10: 15:40. An over-zealous cron job, designed to purge old files from a directory containing orbital elements for checking purposes, deleted a needed file. Until the check file was replaced, some known objects were posted on the NEOCP. Once the check file was replaced, subsequent follow-up observations were correctly identified. The known objects should now have been removed from the NEOCP.
- AUTOACK Issue
2014 June 9: 02:40. In an apparent repeat of the May 29 incident, the communication between two parts of the MPC processing pipeline broke for as-yet-unexplained reasons about an hour ago. The most visible outside effect of this outage is the lack of recent ACKs. We have identified which machine has the problem and a reboot is planned in the next 15 minutes or so.
- 03:31. The problem machine was rebooted. Normalcy has returned.
- Backlog of New NEO Processing
2014 May 29: 15:00. The communication between two parts of the MPC processing pipeline broke for as-yet-unexplained reasons late yesterday. This prevented, amongst other less important things, the processing of new NEO candidate batches. The loss of communication was transitory, lasting perhaps 20 minutes, but the effects lingered. The stalled processes have been restarted and the backlog of new NEO batches is being cleared. We are also looking at how to detect this problem as soon as it occurs.
- MPEC 2014-H31 and Designation 2014 HL4 Abandoned
2014 April 24: 14:44. The fix that was supposed to prevent a repeat of today's problem with duplicate designations and MPECs did not work. The circular for 2014 HL4 has been abandoned (the next circular will use the same number) and the designation 2014 HL4 has been abandoned.
- MPEC 2014-H26 and Designation 2014 HF3 Abandoned
2014 April 24: 10:50. Due to a production issue with the recently-implemented fully-automatic preparation and issuance of MPECs, two circulars were issued for the same object under different designations. The circular for 2014 HF3 has been abandoned (the next circular will use the same number) and the designation 2014 HF3 has been abandoned.
- Temporary Removal of Further Observation Information in MPES
2014 April 17: 02:35. Following an e-mail report from an outside user about occasional inconsistent information displayed by 'Further Information' between the full and summary output in the MPES, we have temporarily disabled this output. It will return in the near future in an extended form.
- May 4: 13:12. The 'Further Information' link has been restored.
- Temporary Outage of AUTOACK
2014 April 16: 22:48. We need to reboot two machines in order to clear a hardware problem with a disk. One of the affected machines is the machine that accepts incoming observations. The exact time of the reboot is not yet known, but it will be soon. ACKs will not be sent out during this reboot process. The total downtime is hoped to be under 30 minutes.
- 23:30. The two machines have been rebooted, but an issue with ensuring consistency of the multiple shared disks in the cluster is preventing the machines from fully rejoining the cluster. We are on-site and will restart processes as soon as the disk data integrity is ensured.
- April 17: 00:00. The data integrity checks have finished and the two rebooted machines have rejoined the cluster.
- Problem With Uncertainty Map Generation and Sky Coverage Plots
2014 March 20: 12:38. A user has alerted us to the fact that attempts to generate uncertainty maps and sky coverage plots are failing with the dreaded (and completely unhelpful) error message "internal server error". We are investigating why this error has suddenly appeared without any change being made in those processes.
- 13:02. The image generation problem is fixed. A security update trashed a softlink needed by the graphics library. A workaround is being put in place to try and prevent this happening again.
- Power Problem in CDP Computer Room
2014 March 8: 15:00. The power has failed in the CDP computer room where the MPC's computers reside.
- 18:00. Power has been restored. Website should be back shortly. Restarting processing pipelines will take longer.
- 23:00. Normalcy should now be restored.
- E-Mail Issues
2014 February 27: 15:00. With zero notice, the CF has made a change in the delivery of e-mail addressed to obs@cfa. This change has resulted in no e-mails being delivered to the automated processes. This means that incoming batches are not being ACKed. We are in communication with the CF to get this resolved ASAP or the former behavior of the e-mail system restored.
Please do not resend observation batches. Any batches that arrived during the "problem" will be forwarded to obs@cfa when normalcy returns.
- 17:00. Some semblance of normalcy has returned. ACKs are being sent out and automated processes are chugging away. Batches that arrived during the "outage" have been resubmitted.
- February 28: 03:00. There may still be some lingering problems. We are seeing unusual periods of inactivity, where no observation batches come in.
- 04:00. After submitting a batch of observations, if you see a response informing you that "some recipients of email@example.com might not receive your message" and you received the normal ACK, you can ignore the warning. This is some drivel that Google Mail is spouting and we haven't yet figured out how to turn off this "feature".
- Network Problems
2014 February 18: 17:00. There were major networking issues amongst some of our computers over the past 24 hours. These had a major impact on some internal operations. We believe the bulk of the problems have been resolved. Batches that have not been processed will be resubmitted for processing.
- Problem With Uncertainty Map Generation and Sky Coverage Plots
2014 February 9: 22:30. A user has alerted us to the fact that attempts to generate uncertainty maps and sky coverage plots are failing with the dreaded (and completely unhelpful) error message "internal server error". We are investigating why this error has suddenly appeared without any change being made in those processes.
- 23:05. The image generation problem is fixed. The setup of a shared NFS directory apparently deleted a soft link that was needed by the graphics library. Why that one soft link, out of many in the directory, affecting an application that has no connection to the NFS shared directory, should have been deleted is unknown.
- NEAObs Problems Mostly Fixed
2014 February 9: 14:00. Recent problems with stale data in the NEAObs service have been resolved. Some more generation of data will be necessary over the next few days, and we are looking at porting the entire procedure that generates the data needed onto newer hardware.
- Problem With E-Mail Delivery Overnight
2014 January 6: 12:50. The e-mail interface between Google Mail (the e-mail system in use at CfA) and the MPC apparently stopped working between about 05:00 and 10:00 UTC this morning. Essentially no e-mail arrived at the MPC during this period. We have contacted the Computation Facility to try and find out why this happened.
- 14:10. The CF is investigating and has shut down @cfa addresses while they do this check. No word yet on when normal service will resume.
- 16:11. Although we have had no word from the CF yet, observation batches are being delivered without any delays. It is unclear if the backlog of non-delivered messages has been cleared yet.
- Problem with MPEC 2013-Y09
2013 December 18: 21:30. A problem with the filing of the newly-referenced orbits from MPEC 2013-Y09 has been identified. The orbits will be republished on the next DOU MPEC and will be accessible in, e.g., the MPES after that publication.
- Website Issue
2013 December 10: 21:50. A problem with the MPC's webserver required us to shut down the server for a while earlier today. When the system was brought back up, access to cgi scripts continued to be disabled for outside sites for a while longer, to allow us to diagnose the issue. The problem has been identified and a fix is in place. Access to cgi scripts has been restored.
- DOU MPECs
2013 November 11: 18:00. Automatic issuance of DOU MPECs will be suspended temporarily while we reconfigure the procedure. There will be manual issuance of DOU MPECs at odd times over the next few days.
- NEOCP Batch Processing
2013 November 8: 23:00. Due to a recent incident of gross misidentification of an NEOCP object by an observer, NEOCP batches will henceforth be passed through the NEWNEO procedure to try and minimize similar identification problems in the future. Observers will therefore see reports that were previously sent only upon submission of a new NEO report.
- NEOCP Problems Overnight
2013 November 7: 15:45. An over-zealous cron job deleting old temporary files ignored the "do not delete this file" protection on the script that generates the variant orbits and deleted it. Needless to say, this confused the NEWNEO procedure and caused newly-reported NEO candidates to not be posted. We have replaced the deleted procedure, modified the delete cron job to be more selective, and resubmitted the batches that failed.
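As an illustration of what a "more selective" cleanup can look like, here is a minimal sketch. It is not the MPC's actual cron job: the function names, the explicit protected-name list and the dry-run default are all assumptions made for the example.

```python
import os
import time


def stale_files(directory, max_age_seconds, protected, now=None):
    """Return sorted paths of regular files older than max_age_seconds.

    Files whose names appear in `protected` are never returned,
    regardless of age. Subdirectories and symlinks are skipped.
    """
    now = time.time() if now is None else now
    stale = []
    for entry in os.scandir(directory):
        if not entry.is_file(follow_symlinks=False):
            continue
        if entry.name in protected:
            continue
        age = now - entry.stat(follow_symlinks=False).st_mtime
        if age > max_age_seconds:
            stale.append(entry.path)
    return sorted(stale)


def delete_stale(directory, max_age_seconds, protected, dry_run=True):
    """Delete stale files; with dry_run=True only report what would go."""
    victims = stale_files(directory, max_age_seconds, protected)
    if not dry_run:
        for path in victims:
            os.remove(path)
    return victims
```

Defaulting to a dry run means a misconfigured invocation lists its candidates rather than removing them.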
- Sluggish Webserver Response
2013 October 28: 13:40. We have received reports that certain web services have been unresponsive or non-functional this morning. We have determined that the load on the webserver machine is extremely high, due to a number of sites sending hundreds of requests per minute to our various services. The offending IP addresses have been blocked.
- 14:18. Problems persist, so a webserver reboot is being performed.
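Identifying clients that send "hundreds of requests per minute" can be sketched as below. This is a hypothetical illustration, not the procedure actually used: it assumes the access log has already been parsed into (IP address, Unix timestamp) pairs, and the threshold is arbitrary.

```python
from collections import Counter


def heavy_clients(requests, threshold):
    """Return the set of IPs exceeding `threshold` requests in any clock minute.

    `requests` is an iterable of (ip, unix_timestamp) pairs.
    """
    # Bucket each request by (ip, minute number since the epoch).
    per_minute = Counter((ip, int(ts) // 60) for ip, ts in requests)
    return {ip for (ip, _minute), count in per_minute.items() if count > threshold}
```

The resulting set could then be reviewed by hand before any address is blocked.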
- Network Problems
2013 October 6: 17:00. There are networking issues amongst some of our computers. These are having a major impact on some internal operations. Replacement of at least one network card is planned for the near future. Issuance of MPECs will be delayed and last night's DOU MPEC has been abandoned.
- 21:50. An interface card has been replaced. Processes have been restarted. Various backlogs have been cleared. MPEC T39 is the recovery MPEC for 2009 VA1, not the Oct. 6 DOU MPEC.
- Plotting Issue
2013 August 4: 16:15. We have identified a difference in the environment on the new webserver (which also serves as the cgi server and the DB server) when compared to the old webserver. This difference is causing dynamically-generated plots (and other temporary files) to be written to the wrong directory. This is being fixed.
- 18:00. The temporary directory issue has been fixed.
- August 6: 19:00. A (partial) fix for the PNG display of sky coverage charts and uncertainty maps is in place. The sky coverage charts are missing the textual information and the uncertainty-map grids currently show no points. The missing font problem is being looked at.
- August 7: 14:50. Copying the PGPLOT font file from the old app server to the new server, and repointing one environment variable, returned fonts to the images generated by the sky coverage and uncertainty map scripts! The generation of NEOCP uncertainty plots has been moved back to the new server.
- Mail Issues
2013 August 4: 23:13. Four of the cluster machines are experiencing problems with name resolution and e-mail. One machine that was sending email at 21:30 is no longer. The reasons for this are unclear, but it is having an impact on the operation of a number of automated routines. We are investigating.
- August 5: 14:00. After attempting to modify the existing configuration, we gave up and reconfigured the TCPIP software from scratch on all our machines. We believe all issues to be fixed.
- Some CGI Issues on the New Web Server
2013 July 31: 00:07. Dynamic image generation, as used in the cgi scripts that return uncertainty maps or sky coverage plots, is currently off-line for two reasons. The first problem was incorrect protection on the temporary directory used to store the images--this is fixed. The other problem is that the PGPLOT graphics library is not installed. This is being worked on, but the automatic installation procedure is apparently not working (why am I not surprised...), so the installation may have to be done manually.
- 14:00. The PGPLOT library is installed, but the required PNG support is disabled, due to an incompatibility between the PGPLOT library and the libpng library used for PNGs. This incompatibility occurs on newer versions of the OS. Attempts to switch back to GIF images generated segfaults, so there may be incompatibilities here, too. We are investigating our options.
- Aug. 1: 11:33. We are trying to get the uncertainty maps for NEOCP working on the old app server. This requires a sync between the new and old machines every time the NEOCP is updated. All the necessary processes are written, but there is a protection problem with the sync process. We are trying an alternate approach.
- Aug. 1: 11:50. Textual information on NEOCP uncertainties is still functional, as is PS output of Sky Coverage plots.
- New Web Server
2013 July 29: 21:00. A new-look MPC website on a new web server is being brought on-line today. The internal changes necessary to copy material to the new web server machine are complete. The switchover of the DNS records to allow external users transparent access to the new site will be done in about 90 minutes. It may take some hours for the DNS changes to propagate to all DNS servers.
- July 30: 00:50. The DNS records were updated around 22:30. The change was visible on home systems within roughly 30 minutes. The DNS record for the alias scully, which points to the script server, has to be updated by the Computation Facility. A request was put in a few hours ago, but it is unknown when the change will be made.
- Physical Move for Some MPC Computers
2013 July 24: 15:30. The MPC computers that remain at 60 Garden Street will be physically moved to CDP late on Thursday, July 25. Automated procedures will be shut down some time prior to the (as yet unknown) move time. An update will appear here when the move is completed.
- July 25: 13:45. The automated processes are being shut down in preparation for the physical move. Patch installation will be performed before the move.
- July 26: 00:30. The patches have been installed, the systems shut down, transported over to CDP, reconnected and powered up, and the cluster has been reformed. Unfortunately, there appears to be a network issue related to the new IP addresses necessitated by the move that prevents our machines talking to any machine outside the cluster. There is nothing we can do to fix the problem. It will require the involvement of the CF and that will not happen before the morning.
- July 26: 20:40. The network problems have been resolved (incorrect configuration information supplied by the CF). We are doing some checking before restarting the automated processing routines.
- July 26: 21:20. It turns out all our key authentications are now broken due to the required assignment of new IP addresses. This will need fixing before we can restart any automated processes.
- July 27: 02:00. Automated processes are being restarted.
- July 27: 13:00. Most automated processes are restarted. DOU MPEC should resume tonight.
- Two Incorrect Numbered Orbits
2013 July 24: 15:30. Following a query about non-NEOs being flagged as NEOs, we have determined that the orbits for (103790), published on MPS 266361, and (105071), published on MPS 266365, refer to the numbered objects (363790) and (365071), respectively. The problem arose because both these objects have radar observations and, at the time these orbits were prepared, the radar-orbit preparation routine did not handle designations above (359999) correctly. The radar-orbit preparation routine has been fixed and new orbits will appear on the next DOU MPEC.
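The (359999) boundary is where the MPC's five-character packed number format switches from an upper-case to a lower-case leading letter (A-Z stand for 10-35, a-z for 36-61). The entry above does not say exactly what went wrong in the routine; the sketch below merely shows that the reported numbers are consistent with a case-insensitive reading of the leading letter. The function names are illustrative and this is not the MPC's code.

```python
import string

# Leading character for numbers >= 100000: A..Z -> 10..35, a..z -> 36..61.
_PREFIX = string.ascii_uppercase + string.ascii_lowercase


def pack_number(n: int) -> str:
    """Pack a minor planet number (1..619999) into five characters."""
    if not 0 < n <= 619999:
        raise ValueError("number outside the range of this packed format")
    if n < 100000:
        return f"{n:05d}"
    return _PREFIX[n // 10000 - 10] + f"{n % 10000:04d}"


def unpack_number(packed: str) -> int:
    """Inverse of pack_number."""
    if packed[0].isdigit():
        return int(packed)
    return (_PREFIX.index(packed[0]) + 10) * 10000 + int(packed[1:])


# Up to (359999) the leading letter is upper case; above it, lower case.
assert pack_number(359999) == "Z9999"
assert pack_number(363790) == "a3790"

# Illustration only: folding the case of the leading letter maps the two
# affected objects onto the numbers quoted in the entry above.
assert unpack_number(pack_number(363790).upper()) == 103790
assert unpack_number(pack_number(365071).upper()) == 105071
```

Any routine that upper-cases, or otherwise ignores the case of, a packed designation would therefore work for every object up to (359999) and silently mislabel those beyond it.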
- Planned Power Interruption at CDP
2013 July 8: 19:05. We have been informed that there will be a power outage at CDP on Wednesday, July 10, in order to repair/replace electrical circuit breaker surge protection equipment. This is related to the May 16 issue. Computer systems, including our public-facing systems, will be shut down starting at 20:30 UTC and power should be restored by 01:00 UTC on July 11.
- Web Site Issues
2013 July 3: 12:30. It appears that the MPC webserver is unavailable. A large number of ruby jobs (presumably doing DB updates) are running and this is clogging up the server. The script server (aliased as scully.cfa.harvard.edu) is working normally. We are investigating.
- Misassignment of Minor Planet Names in June
2013 June 26: 12:48. Due to a production error, many of the new names accepted in the June Minor Planet Circulars were assigned to the wrong minor planet. The names will be republished in the July Circulars.
- Unexpected Power Interruption May 16
2013 May 16: 16:00. We had one hour's notice that power to the building where part of our computer system resides would be turned off to allow engineering work. We were able to stop the automated processes and dismount the shadow sets, but were unable to stop the machines restarting when power was restored. This uncontrolled power up has caused major problems to the cluster. One machine is off line (service has been called) and one external disk box is not sharing its disks (one member of each of four three-member shadow sets). No data has been lost, but until the affected machine has been repaired, certain internal tasks may be delayed.
- One thing that we cannot easily replicate while the off-line machine is unavailable is the complete set of datasets necessary for the various MPChecker facilities. Our internal checking routines are not affected by this problem.
- There is a workaround in place for the MPChecker services. Note that full functionality is not yet back in place as the service is missing the nearby-epoch data for NEOs and the historical data for all objects.
- The May Minor Planet Circulars batch will not be issued.
- Network Interruption May 14
2013 May 14: 16:00. We have been reminded that the main network router at 60 Garden Street will be upgraded during the time period 21:30 UTC on May 14 to 01:30 UTC on May 15. MPC webservices should remain accessible during this network outage, but updating of pages will not be possible due to internal network disruption.
- Potential Service Interruptions May 18
2013 May 14: 13:25. We have been informed that Computation Facility services will be unavailable between 13:00 and 18:00 UTC on Saturday, May 18. CF webservices and e-mail service will be unavailable during this period. The MPC webservices (and the updating of data used by the services) should be unaffected by this downtime.
- SKYCOV Restored
2013 Apr. 8: 16:26. The SKYCOV service has been restored to full functionality and sky coverage data is again available up to the current date.
- SKYCOV and DELCODE Removals
2013 Apr. 6: 01:52. We are now looking at repairing the SKYCOV service. This will also fix the problem of the automatic removal of DELCODE requests. In the meantime, the script that validates DELCODE requests for removal will forward the request to a staff member for removal.
- Apr. 7: 13:45. Automated removal of DELCODE requests was reactivated last night. Please report any problems with removals.
- Apr. 7: 15:45. The SKYCOV service is almost restored. Batches of sky coverage are being copied to a temporary directory on the app server. We just need to move those copied files to their final locations and rebuild the catalogue of available files. These tasks will be done tomorrow.
- File Protection Problems on Web Server
2013 Mar. 27: 12:15. Following the updating over the past two nights of various files on the web server machine, it appears that the transfer process has suddenly started disabling world/other read permission on two of the many files that it copies. This is annoying as it prevents the MPES from running. This problem has occurred in the past, for different files over varying periods of time. We are investigating the best way to locate problem files and reset the permissions if found to be wonky.
- 12:45. The potential source of the problem has been identified and a fix has been implemented. As to why only two of four transferred files (all unpacked into the same directory) were affected by this problem, we have no idea!
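Locating files that have lost world/other read permission, and resetting it, can be sketched as follows. This is an illustrative example only: the directory layout and function names are assumptions, and the fix actually implemented is not described above.

```python
import os
import stat


def unreadable_by_others(directory):
    """Return sorted paths of regular files lacking the world-read bit."""
    problems = []
    for entry in os.scandir(directory):
        if not entry.is_file(follow_symlinks=False):
            continue
        mode = entry.stat(follow_symlinks=False).st_mode
        if not mode & stat.S_IROTH:
            problems.append(entry.path)
    return sorted(problems)


def restore_world_read(paths):
    """Add the world-read bit to each path, leaving other bits untouched."""
    for path in paths:
        mode = stat.S_IMODE(os.stat(path).st_mode)
        os.chmod(path, mode | stat.S_IROTH)
```

Running the check immediately after each transfer would catch a file with the wrong permissions before a web service tries to read it.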
- AUTOACK Problem
2013 Mar. 25: 20:32. The machine that sends out ACKs and processes incoming e-mails is having hardware issues. We are investigating and attempting to move processes to another machine.
- Mar. 26: 16:31. A rebuild of the process was required. A functioning version of AUTOACK is now running on another machine. Some recent observation batches have been ACK'ed by this new routine running in test mode. Other observation batches that arrived after Mar. 25 17:16:21 UTC and before March 26, which were processed manually during the outage, will be forwarded to AUTOACK. Observers may receive two (or more) ACKs for messages. A semblance of normalcy has been restored, but further work will be necessary to add functionality to the new AUTOACK process.
- Mar. 26: 14:10. The SKYCOV service has also been affected. A partial fix to this problem is forthcoming (accepting of incoming e-mails), but a full fix will have to wait until after the March Minor Planet Circulars are prepared.
- Mar. 26: 20:14. Extensive testing of the new routine showed up some problems, which have all been fixed. Automated ACKs have resumed.
- E-mail Submission of Observations Via mpc@cfa Shut Off
2013 Mar. 4: 13:32. As noted on MPEC 2013-A21, the e-mail address mpc@cfa can no longer be used for submission of observations. E-mail submission of observations must be via obs@cfa. Any future submissions of observations to mpc@cfa will not be acknowledged. The e-mail address mpc@cfa will continue as a general contact address for the MPC.
- No Mid-Month MPS Batch
2013 Feb. 8: 23:55. As a direct result of this weekend's major snow storm in New England and the consequent major risk of extended power outages, there will be no mid-month MPS batch issued this weekend.
- Problem With MPES
2013 Jan. 30: 15:14. It appears that there is a problem with one (or more) of the data files used by the MPES. We are investigating.
- 16:14. The problem with the MPES data files has been located and fixed.
- Intermittent Network Problems
2013 Jan. 9: 22:40. We have received a report from our CF that Harvard is investigating intermittent network connectivity problems. We have already received a report of a script being temporarily inaccessible.
- Extended Power Outage at CDP
2013 Jan. 3: 15:01. There was a four-hour power outage at CDP between 07:30 and 11:20 this morning. The website and scripts should again be visible. Internal operations are being restarted as necessary.
- 21:10. We have been informed that the power issue this morning fried one of the main electrical power fuses in the computer room. It will be replaced at 22:00 on Monday, January 14. This will require shutting off the power. The downtime is expected to be under 30 minutes, but one hour is the announced timeframe. Website and scripts will be off-line from around 21:30 until power is restored.
- Archive of older enhancements and problems/resolutions