MPC Status Page
This page describes enhancements to and problems that have occurred with the MPC's webpages and scripts and the fixes that have been made. Resolved problems will remain in the Outstanding Problems sections for a brief period after resolution before being moved. This page will also try and give advance notice of the unavailability of these services due to network maintenance.
Known problems with contacting certain e-mail addresses are listed elsewhere.
Unless otherwise indicated, all times on this page are UTC times. Prior to 2010 Nov. 7, the times shown are local times (either Eastern Standard Time [UTC-5h] or Eastern Daylight Time [UTC-4h], as appropriate).
Status Messages and Outstanding Problems
- Unexpected Machine Reboot Overnight
2014 November 25: 21:00. Early this morning, one of our cluster machines rebooted unexpectedly. Upon reboot, other machines in the cluster failed to remount one of the hosted shadow set disks. This stalled processing. A dirty cluster reboot had to be performed. NEO-related processes were brought back first. Some less urgent processes remain to be restarted. If you think a batch you submitted today has not been processed, please contact us. You should only resubmit if we ask you to.
- Suspension of ACKs
2014 November 19: 18:55. The AUTOACK process has been suspended while we
investigate an odd problem with the compute cluster. E-mails sent
to obs@cfa will be ACKed once we figure out the problem.
- 21:01. AUTOACK has been restarted.
- Suspension of ACKs
2014 October 30: 16:00. The AUTOACK process has been suspended, so that we
can set up master-slave replication of our DBs with a remote system.
We anticipate that this will take about two hours. All web services
should continue to operate normally during this suspension.
- 17:18. DB setup has completed. AUTOACK has been restarted.
- Missing Information in MPES
2014 October 17: 01:39. It turns out that a large-capacity data disk on the
webserver failed to mount following the recent power outage. Among other
things, the disk contains the residual blocks and
the datafiles necessary to generate uncertainty information. We will
try and get it mounted as soon as possible.
- 14:04. The disk is now mounted. Residual blocks and uncertainty information are again available.
- Power Outage
2014 October 15: 05:27. There was an unplanned power outage in the computer
room at CDP on October 14 @ 20:27. All MPC machines were affected.
Power was restored shortly after 24:00. All automated processes have
been restarted and the web site appears to be functional. One serious
problem remains: no observational e-mail is being delivered to the
processing queues. A query has been sent to the Computation Facility
to track down why e-mail is not being delivered. Fortunately, e-mail
is also delivered to a staff e-mail address, so the observations that
have come in since the power outage are being submitted via the cURL
route. This is a manual process. Observers are encouraged to use
the cURL submission method, at least until the e-mail issue is resolved.
- October 15: 07:00. The power failed again in the CDP computer room!
- 10:00. Power was restored. Systems were restarted and the processing pipelines have been (mostly) restarted. It is possible that some observation batches may not have been submitted to the processing queues. If you are lacking an ACK for a batch, you have not received a status message for a new NEO candidate, or a NEO candidate that should have been posted wasn't, please contact us (making sure to mention which batch and designation you are referring to). Do not resend unless we ask you to do so.
- MPEC mailing problem
2014 September 22: 23:40. It appears that the mail server we use has decided
to flag our MPEC mailings as "suspicious" and is not sending them out.
We will try and get this block removed.
- September 23: 01:35. We have received reports that NEOCP Blog postings are not being sent to subscribers who have requested e-mail delivery. Posts are appearing on the NEOCP Blog itself.
- 02:25. While we were investigating, the problem with the mail server disappeared and stacked-up e-mail began to be delivered, without us actually changing anything.
- Two MPEC 2014-Q39
2014 August 24: 19:10. Two MPECs, for different objects, were issued with the same MPEC number. The correct 2014-Q39 is the one listing the recovery of P/2005 Q4 (LINEAR), while the recovery MPEC for 2011 ON45 is now 2014-Q40. The web site is being updated with these corrections.
- NEOCP Updating Problem
2014 June 25: 00:30. A configuration problem has been causing updates to the NEOCP to fail for the past few hours. The fix for this is being worked on actively.
- 06:07. The problems have been fixed and the backlog of new NEO and NEOCP batches has been resubmitted to the processing queues. We believe that all batches have now been processed. Please report any missing NEO/NEOCP observations.
- PDF of 2014 June 13 MPCs Replaced
2014 June 17: 12:30. The PDF of the 2014 June 13 MPCs has been replaced, following an e-mail report that the new numbers were missing from one table in the circulars. There was an issue during the preparation of this batch that caused the numbers to be omitted from certain tables/files. The datafiles in the MPCUPDATE service were fixed earlier.
- June 24: 18:00. The file was again replaced to fix a minor issue.
- Known Objects Posting on NEOCP
2014 June 10: 15:40. An over-zealous cron job, designed to purge old files from a directory containing orbital elements for checking purposes, deleted a needed file. Until the check file was replaced, some known objects were posted on the NEOCP. Once the check file was replaced, subsequent follow-up observations were correctly identified. The known objects should now have been removed from the NEOCP.
- AUTOACK Issue
2014 June 9: 02:40. In an apparent repeat of the May 29 incident, the communication between two parts of the MPC processing pipeline broke for as-yet-unexplained reasons about an hour ago. The most visible outside effect of this outage is the lack of recent ACKs. We have identified which machine has the problem and a reboot is planned in the next 15 minutes or so.
- 03:31. The problem machine was rebooted. Normalcy has returned.
- Backlog of New NEO Processing
2014 May 29: 15:00. The communication between two parts of the MPC processing pipeline broke for as-yet-unexplained reasons late yesterday. This prevented, amongst other less important things, the processing of new NEO candidate batches. The loss of communication was transitory, lasting perhaps 20 minutes, but the effects lingered. The stalled processes have been restarted and the backlog of new NEO batches is being cleared. We are also looking at how to detect this problem as soon as it occurs.
- MPEC 2014-H31 and Designation 2014 HL4 Abandoned
2014 April 24: 14:44. The fix that was supposed to prevent a repeat of today's problem with duplicate designations and MPECs did not work. The circular for 2014 HL4 has been abandoned (the next circular will use the same number) and the designation 2014 HL4 has been abandoned.
- MPEC 2014-H26 and Designation 2014 HF3 Abandoned
2014 April 24: 10:50. Due to a production issue with the recently-implemented fully-automatic preparation and issuance of MPECs, two circulars were issued for the same object under different designations. The circular for 2014 HF3 has been abandoned (the next circular will use the same number) and the designation 2014 HF3 has been abandoned.
- Temporary Removal of Further Observation Information in MPES
2014 April 17: 02:35. Following an e-mail report from an outside user about occasional inconsistent information displayed by 'Further Information' between the full and summary output in the MPES, we have temporarily disabled this output. It will return in the near future in an extended form.
- May 4: 13:12. The 'Further Information' link has been restored.
- Temporary Outage of AUTOACK
2014 April 16: 22:48. We need to reboot two machines in order to clear a hardware problem with a disk. One of the affected machines is the machine that accepts incoming observations. The exact time of the reboot is not yet known, but it will be soon. ACKs will not be sent out during this reboot process. The total downtime is hoped to be under 30 minutes.
- 23:30. The two machines have been rebooted, but an issue with ensuring consistency of the multiple shared disks in the cluster is preventing the machines from fully rejoining the cluster. We are on-site and will restart processes as soon as the disk data integrity is ensured.
- April 17: 00:00. The data integrity checks have finished and the two rebooted machines have rejoined the cluster.
- Problem With Uncertainty Map Generation and Sky Coverage Plots
2014 March 20: 12:38. A user has alerted us to the fact that attempts to generate uncertainty maps and sky coverage plots are failing with the dreaded (and completely unhelpful) error message "internal server error". We are investigating why this error has suddenly appeared without any change being made in those processes.
- 13:02. The image generation problem is fixed. A security update trashed a softlink needed by the graphics library. A workaround is being put in place to try and prevent this happening again.
- E-Mail Issues
2013 February 27: 15:00. With zero notice, the CF has made a change in the delivery of e-mail addressed to obs@cfa. This change has resulted in no e-mails being delivered to the automated processes. This means that incoming batches are not being ACKed. We are in communication with the CF to get this resolved ASAP or the former behavior of the e-mail system restored.
Please do not resend observation batches. Any batches that arrived during the "problem" will be forwarded to obs@cfa when normalcy returns.
- 17:00. Some semblance of normalcy has returned. ACKs are being sent out and automated processes are chugging away. Batches that arrived during the "outage" have been resubmitted.
- February 28: 03:00. There may still be some lingering problems. We are seeing unusual periods of inactivity, where no observation batches come in.
- 04:00. After submitting a batch of observations, if you see a response informing you that "some recipients of firstname.lastname@example.org might not receive your message" and you received the normal ACK, you can ignore the warning. This is some drivel that Google Mail is spouting and we haven't yet figured out how to turn off this "feature".
- Network Problems
2013 February 18: 17:00. There were major networking issues amongst some of our computers over the past 24 hours. These had a major impact on some internal operations. We believe the bulk of the problems have been resolved. Batches that have not been processed will be resubmitted for processing.
- Problem With Uncertainty Map Generation and Sky Coverage Plots
2014 February 9: 22:30. A user has alerted us to the fact that attempts to generate uncertainty maps and sky coverage plots are failing with the dreaded (and completely unhelpful) error message "internal server error". We are investigating why this error has suddenly appeared without any change being made in those processes.
- 23:05. The image generation problem is fixed. The setup of a shared NFS directory apparently deleted a soft link that was needed by the graphics library. Why that one soft link, out of many in the directory, affecting an application that has no connection to the NFS shared directory, should have been deleted is unknown.
- NEAObs Problems Mostly Fixed
2014 February 9: 14:00. Recent problems with stale data in the NEAObs service have been resolved. Some more generation of data will be necessary over the next few days, and we are looking at porting the entire procedure that generates the needed data onto newer hardware.
- Problem With E-Mail Delivery Overnight
2014 January 6: 12:50. The e-mail interface between Google Mail (the e-mail system in use at CfA) and the MPC apparently stopped working between about 05:00 and 10:00 UTC this morning. Essentially no e-mail arrived at the MPC during this period. We have contacted the Computation Facility to try and find out why this happened.
- 14:10. The CF is investigating and has shut down @cfa addresses while they do this check. No word yet on when normal service will resume.
- 16:11. Although we have had no word from the CF yet, observation batches are being delivered without any delays. It is unclear if the backlog of non-delivered messages has been cleared yet.
- Problem with MPEC 2013-Y09
2013 December 18: 21:30. A problem with the filing of the newly-referenced orbits from MPEC 2013-Y09 has been identified. The orbits will be republished on the next DOU MPEC and will be accessible in, e.g., the MPES after that publication.
- Website Issue
2013 December 10: 21:50. A problem with the MPC's webserver required us to shut down the server for a while earlier today. When the system was brought back up, access to cgi scripts continued to be disabled for outside sites for a while longer, to allow us to diagnose the issue. The problem has been identified and a fix is in place. Access to cgi scripts has been restored.
- DOU MPECs
2013 November 11: 18:00. Automatic issuance of DOU MPECs will be suspended temporarily while we reconfigure the procedure. There will be manual issuance of DOU MPECs at odd times over the next few days.
- NEOCP Batch Processing
2013 November 8: 23:00. Due to a recent incident of gross misidentification of an NEOCP object by an observer, NEOCP batches will henceforth be passed through the NEWNEO procedure to try and minimize similar identification problems in the future. Observers will therefore see reports that were previously sent only upon submission of a new NEO report.
- NEOCP Problems Overnight
2013 November 7: 15:45. An over-zealous cron job deleting old temporary files ignored the "do not delete this file" protection on the script that generates the variant orbits and deleted it. Needless to say, this confused the NEWNEO procedure and caused newly-reported NEO candidates to not be posted. We have replaced the deleted procedure, modified the delete cron job to be more selective, and resubmitted the batches that failed.
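A more selective purge of the kind described above might look like the following minimal sketch. It is not the MPC's actual cron job: the function name `purge_old_files`, the `protected` name list, and the age threshold are all assumptions made for illustration.

```python
import time
from pathlib import Path


def purge_old_files(directory, max_age_days, protected=(), now=None):
    """Delete regular files older than max_age_days, skipping protected names.

    Hypothetical helper: `protected` is an explicit list of file names that
    must never be removed, however old they are.
    Returns the list of paths that were removed.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # age limit as a Unix timestamp
    protected = set(protected)
    removed = []
    for path in Path(directory).iterdir():
        # Only touch plain files; leave directories and symlinks alone.
        if path.is_symlink() or not path.is_file():
            continue
        if path.name in protected:
            continue
        if path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

Keeping the protected names in an explicit list, rather than relying on file protection bits alone, means the purge cannot remove a needed script even when it runs with elevated privileges.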
- Sluggish Webserver Response
2013 October 28: 13:40. We have received reports that certain web services have been unresponsive or non-functional this morning. We have determined that the load on the webserver machine is extremely high, due to a number of sites sending hundreds of requests per minute to our various services. The offending IP addresses have been blocked.
- 14:18. Problems persist, so a webserver reboot is being performed.
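As a rough illustration of how clients sending hundreds of requests per minute could be picked out of an access log, the sketch below counts requests per IP address per calendar minute. The input format (pairs of IP address and Unix timestamp), the function name `heavy_clients`, and the threshold are assumptions; the MPC's actual detection method is not described here.

```python
from collections import Counter


def heavy_clients(requests, threshold):
    """Return the IPs that exceed `threshold` requests in any one minute.

    `requests` is an iterable of (ip, unix_timestamp) pairs, a hypothetical
    pre-parsed form of a webserver access log.
    """
    # Bucket each request by (client, calendar minute).
    per_minute = Counter((ip, int(ts) // 60) for ip, ts in requests)
    offenders = {ip for (ip, _minute), count in per_minute.items()
                 if count > threshold}
    return sorted(offenders)
```

For example, a client making 300 requests within one minute would be returned for `threshold=200`, while a client making two requests would not.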
- Network Problems
2013 October 6: 17:00. There are networking issues amongst some of our computers. These are having a major impact on some internal operations. Replacement of at least one network card is planned for the near future. Issuance of MPECs will be delayed and last night's DOU MPEC has been abandoned.
- 21:50. An interface card has been replaced. Processes have been restarted. Various backlogs have been cleared. MPEC T39 is the recovery MPEC for 2009 VA1, not the Oct. 6 DOU MPEC.
- Plotting Issue
2013 August 4: 16:15. We have identified a difference in the environment on the new webserver (which also serves as the cgi server and the DB server) when compared to the old webserver. This difference is causing dynamically-generated plots (and other temporary files) to be written to the wrong directory. This is being fixed.
- 18:00. The temporary directory issue has been fixed.
- August 6: 19:00. A (partial) fix for the PNG display of sky coverage charts and uncertainty maps is in place. The sky coverage charts are missing the textual information and the uncertainty-map grids currently show no points. The missing font problem is being looked at.
- August 7: 14:50. Copying the PGPLOT font file from the old app server to the new server, and repointing one environment variable, returned fonts to the images generated by the sky coverage and uncertainty map scripts! The generation of NEOCP uncertainty plots has been moved back to the new server.
- Mail Issues
2013 August 4: 23:13. Four of the cluster machines are experiencing problems with name resolution and e-mail. One machine that was sending email at 21:30 is no longer. The reasons for this are unclear, but it is having an impact on the operation of a number of automated routines. We are investigating.
- August 5: 14:00. After attempting to modify the existing configuration, we gave up and reconfigured the TCPIP software from scratch on all our machines. We believe all issues to be fixed.
- Some CGI Issues on the New Web Server
2013 July 31: 00:07. Dynamic image generation, as used in the cgi scripts that return uncertainty maps or sky coverage plots, is currently off-line for two reasons. The first problem was incorrect protection on the temporary directory used to store the images--this is fixed. The other problem is that the PGPLOT graphics library is not installed. This is being worked on, but the automatic installation procedure is apparently not working (why am I not surprised...), so the installation may have to be done manually.
- 14:00. The PGPLOT library is installed, but the required PNG support is disabled, due to an incompatibility between the PGPLOT library and the libpng library used for PNGs. This incompatibility occurs on newer versions of the OS. Attempts to switch back to GIF images generated segfaults, so there may be incompatibilities here, too. We are investigating our options.
- Aug. 1: 11:33. We are trying to get the uncertainty maps for NEOCP working on the old app server. This requires a sync between the new and old machines every time the NEOCP is updated. All the necessary processes are written, but there is a protection problem with the sync process. We are trying an alternate approach.
- Aug. 1: 11:50. Textual information on NEOCP uncertainties is still functional, as is PS output of Sky Coverage plots.
- New Web Server
2013 July 29: 21:00. A new look MPC website on a new web server is being brought on-line today. The internal changes necessary to copy material to the new web server machine are complete. The switchover of the DNS records to allow external users transparent access to the new site will be done in about 90 minutes. It may take some hours for the DNS changes to propagate to all DNS servers.
- July 30: 00:50. The DNS records were updated around 22:30. The change was visible on home systems within roughly 30 minutes. The DNS record for the alias scully, which points to the script server, has to be updated by the Computation Facility. A request was put in a few hours ago, but it is unknown when the change will be made.
- Physical Move for Some MPC Computers
2013 July 24: 15:30. The MPC computers that remain at 60 Garden Street will be physically moved to CDP late on Thursday, July 25. Automated procedures will be shut down some time prior to the (as yet unknown) move time. An update will appear here when the move is completed.
- July 25: 13:45. The automated processes are being shut down in preparation for the physical move. Patch installation will be performed before the move.
- July 26: 00:30. The patches have been installed, the systems shut down, transported over to CDP, reconnected, powered up and the cluster has been reformed. Unfortunately, there appears to be a network issue related to the new IP addresses necessitated by the move that prevents our machines from talking to any machine outside the cluster. There is nothing we can do to fix the problem. It will require the involvement of the CF and that will not happen before the morning.
- July 26: 20:40. The network problems have been resolved (incorrect configuration information supplied by the CF). We are doing some checking before restarting the automated processing routines.
- July 26: 21:20. It turns out all our key authentications are now broken due to required assignment of new IP addresses. This will need fixing before we can restart any automated processes.
- July 27: 02:00. Automated processes are being restarted.
- July 27: 13:00. Most automated processes are restarted. DOU MPEC should resume tonight.
- Two Incorrect Numbered Orbits
2013 July 24: 15:30. Following a query about non-NEOs being flagged as NEOs, we have determined that the orbits for (103790), published on MPS 266361, and (105071), published on MPS 266365, refer to the numbered objects (363790) and (365071), respectively. The problem arose because both these objects have radar observations and, at the time these orbits were prepared, the radar-orbit preparation routine did not handle designations above (359999) correctly. The radar-orbit preparation routine has been fixed and new orbits will appear on the next DOU MPEC.
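The (359999) boundary matches the five-character packed form of minor-planet numbers, in which the leading character stands for the number divided by 10000: 0-9, then A-Z for 10-35, then a-z for 36-61. The sketch below packs and unpacks numbers in that form. The function names are illustrative, and treating the lower-case letter as upper-case is only one plausible way to reproduce the mix-up; the actual fault in the radar-orbit routine is not stated here.

```python
import string

# Leading characters: 0-9 (0-9), A-Z (10-35), a-z (36-61).
_LEAD = string.digits + string.ascii_uppercase + string.ascii_lowercase


def pack_number(number):
    """Pack a minor-planet number below 620000 into five characters."""
    if not 0 < number < 620000:
        raise ValueError(f"number out of range for this sketch: {number}")
    lead, rest = divmod(number, 10000)
    return f"{_LEAD[lead]}{rest:04d}"


def unpack_number(packed):
    """Unpack a five-character packed number; the lead is case-sensitive."""
    if len(packed) != 5 or packed[0] not in _LEAD or not packed[1:].isdigit():
        raise ValueError(f"not a packed number: {packed!r}")
    return _LEAD.index(packed[0]) * 10000 + int(packed[1:])
```

With these definitions, `pack_number(363790)` gives `a3790`, while `unpack_number("A3790")` gives 103790, so code that folds case on the leading character would turn (363790) into (103790) and (365071) into (105071).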
- Planned Power Interruption at CDP
2013 July 8: 19:05. We have been informed that there will be a power outage at CDP on Wednesday, July 10, in order to repair/replace electrical circuit breaker surge protection equipment. This is related to the May 16 issue. Computer systems, including our public-facing systems, will be shut down starting at 20:30 UTC and power should be restored by 01:00 UTC on July 11.
- Web Site Issues
2013 July 3: 12:30. It appears that the MPC webserver is unavailable. A large number of ruby jobs (presumably doing DB updates) are running and this is clogging up the server. The script server (aliased as scully.cfa.harvard.edu) is working normally. We are investigating.
- Misassignment of Minor Planet Names in June
2013 June 26: 12:48. Due to a production error, many of the new names accepted in the June Minor Planet Circulars were assigned to the wrong minor planet. The names will be republished in the July Circulars.
- Unexpected Power Interruption May 16
2013 May 16: 16:00. We had one hour's notice that power to the building where part of our computer system resides would be turned off to allow engineering work. We were able to stop the automated processes and dismount the shadow sets, but were unable to stop the machines restarting when power was restored. This uncontrolled power up has caused major problems to the cluster. One machine is off line (service has been called) and one external disk box is not sharing its disks (one member of each of four three-member shadow sets). No data has been lost, but until the affected machine has been repaired, certain internal tasks may be delayed.
- One feature that we cannot easily replicate while the off-line machine is unavailable is complete datasets necessary for the various MPChecker facilities. Our internal checking routines are not affected by this problem.
- There is a workaround in place for the MPChecker services. Note that full functionality is not yet back in place as the service is missing the nearby-epoch data for NEOs and the historical data for all objects.
- The May Minor Planet Circulars batch will not be issued.
- Network Interruption May 14
2013 May 14: 16:00. We have been reminded that the main network router at 60 Garden Street will be upgraded during the time period 21:30 UTC on May 14 to 01:30 UTC on May 15. MPC webservices should remain accessible during this network outage, but updating of pages will not be possible due to internal network disruption.
- Potential Service Interruptions May 18
2013 May 14: 13:25. We have been informed that Computation Facility services will be unavailable between 13:00 and 18:00 UTC on Saturday, May 18. CF webservices and e-mail service will be unavailable during this period. The MPC webservices (and the updating of data used by the services) should be unaffected by this downtime.
- SKYCOV Restored
2013 Apr. 8: 16:26. The SKYCOV service has been restored to full functionality and sky coverage data is again available up to the current date.
- SKYCOV and DELCODE Removals
2013 Apr. 6: 01:52. We are now looking at repairing the SKYCOV service. This will also fix the problem of the automatic removal of DELCODE requests. In the meantime, the script that validates DELCODE requests for removal will forward the request to a staff member for removal.
- Apr. 7: 13:45. Automated removal of DELCODE requests was reactivated last night. Please report any problems with removals.
- Apr. 7: 15:45. The SKYCOV service is almost restored. Batches of sky coverage are being copied to a temporary directory on the app server. We just need to move those copied files to their final locations and rebuild the catalogue of available files. These tasks will be done tomorrow.
- File Protection Problems on Web Server
2013 Mar. 27: 12:15. Following the updating over the past two nights of various files on the web server machine, it appears that the transfer process has suddenly started disabling world/other read permission on two of the many files that it copies. This is annoying as it prevents the MPES from running. This problem has occurred in the past, for different files over varying periods of time. We are investigating the best way to locate problem files and reset the permissions if found to be wonky.
- 12:45. The potential source of the problem has been identified and a fix has been implemented. As to why only two of four transferred files (all unpacked into the same directory) were affected by this problem, we have no idea!
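One way to locate files that have lost world/other read permission, and to reset it, is sketched below. This is a minimal illustration assuming a POSIX file system; the function names and the directory passed in are hypothetical, and it is not the fix that was actually implemented.

```python
import os
import stat


def find_unreadable(directory):
    """Return files under `directory` that lack world/other read permission."""
    missing = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            if not os.stat(path).st_mode & stat.S_IROTH:
                missing.append(path)
    return sorted(missing)


def restore_world_read(paths):
    """Add world/other read permission, leaving the other mode bits alone."""
    for path in paths:
        mode = stat.S_IMODE(os.stat(path).st_mode)
        os.chmod(path, mode | stat.S_IROTH)
```

Running `restore_world_read(find_unreadable(data_dir))` after each transfer would reset any affected files, where `data_dir` stands for whichever directory the transfer unpacks into.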
- AUTOACK Problem
2013 Mar. 25: 20:32. The machine that sends out ACKs and processes incoming e-mails is having hardware issues. We are investigating and attempting to move processes to another machine.
- Mar. 26: 16:31. A rebuild of the process was required. A functioning version of AUTOACK is now running on another machine. Some recent observations batches have been ACK'ed by this new routine running in test mode. Other observation batches that arrived after Mar. 25 17:16:21 UTC and before March 26, which were processed manually during the outage, will be forwarded to AUTOACK. Observers may receive two (or more) ACKs for messages. A semblance of normalcy has been restored, but further work will be necessary to add functionality to the new AUTOACK process.
- Mar. 26: 14:10. The SKYCOV service has also been affected. A partial fix to this problem is forthcoming (accepting of incoming e-mails), but a full fix will have to wait until after the March Minor Planet Circulars are prepared.
- Mar. 26: 20:14. Extensive testing of the new routine showed up some problems, which have all been fixed. Automated ACKs have resumed.
- E-mail Submission of Observations Via mpc@cfa Shut Off
2013 Mar. 4: 13:32. As noted on MPEC 2013-A21, the e-mail address mpc@cfa can no longer be used for submission of observations. E-mail submission of observations must be via obs@cfa. Any future submissions of observations to mpc@cfa will not be acknowledged. The e-mail address mpc@cfa will continue as a general contact address for the MPC.
- No Mid-Month MPS Batch
2013 Feb. 8: 23:55. As a direct result of this weekend's major snow storm in New England and the consequent major risk of extended power outages, there will be no mid-month MPS batch issued this weekend.
- Problem With MPES
2013 Jan. 30: 15:14. It appears that there is a problem with one (or more) of the data files used by the MPES. We are investigating.
- 16:14. The problem with the MPES data files has been located and fixed.
- Intermittent Network Problems
2013 Jan. 9: 22:40. We have received a report from our CF that Harvard is investigating intermittent network connectivity problems. We have already received a report of a script being temporarily inaccessible.
- Extended Power Outage at CDP
2013 Jan. 3: 15:01. There was a four-hour power outage at CDP between 07:30 and 11:20 this morning. The website and scripts should again be visible. Internal operations are being restarted as necessary.
- 21:10. We have been informed that the power issue this morning fried one of the main electrical power fuses in the computer room. It will be replaced at 22:00 on Monday, January 14. This will require shutting off the power. The downtime is expected to be under 30 minutes, but one hour is the announced timeframe. Website and scripts will be off-line from around 21:30 until power is restored.
- Archive of older enhancements and problems/resolutions