11/7/08 - JK Began backing up C: and D: of the entire CSP-AD server using Acronis True Image Enterprise Edition to an external drive on the CSP-BACKUP server.
7:30pm - JK Came back to find that the backup completed successfully. Began repartitioning the drives to allocate free space to a 25GB C: and a 180GB D: using Acronis Disk Director Server.
11:00pm - JK Came back to find that the Disk Director repartitioning had completed. Rebooted CSP-AD. LSASS.EXE error upon startup - Active Directory is corrupt.
Checked the images that were backed up at 4:30-7:30pm and they are gone - don't know where they went. Suspect that they didn't finish copying over the LAN, although when checked it appeared there was no network traffic and the drive held the image files.
THERE ARE NO GOOD BACKUPS OF CSP-AD RIGHT BEFORE THE REPARTITIONING
Checked the backups from Backup Exec - appears there is a good backup of system state and all files from 11/7/08 ~3:30am. We can restore from "tape" if needed, but it is not preferred.
Started investigating KB articles on the Lsass.exe error - appears that it is indeed an AD corruption issue.
11:45pm - Notified Bob Kennedy that the network will be down all weekend and that we will not be back up and running as scheduled.
Exchange is down.
Private.cspinc.com - web server is unresponsive.
Multiple DNS on DMZ - no routes for secondary DNS server on the Cisco PIX -> DMZ -> LAN.
Will need to come in tomorrow morning to begin the troubleshooting and repair process.

******11/8/08 - SATURDAY - JK*******
Restored the CSP passwords file from backup to get passwords for the network.
Booted into Directory Services Restore mode; the local administrator password from the current documentation is not working.
Reset the local password using the Linux NT password recovery utility to what is in the docs. Not working. Reset again to blank - is now working.
Can now boot into DSR mode successfully. Set up a tempadmin/tempadmin user just in case the administrator account fails again.
Had to do this through the command line, "net user tempadmin tempadmin /add" then "net localgroup administrators tempadmin /add", so it could be done in DSR.
Researched KB for further steps.
Took a full Acronis backup image of the server in its current state --- CSP-AD
Took a full Acronis backup image of the server in its current state --- CSP-BACKUP
That way, if all goes wrong, we can restore to this point in time. Backed up to disk on the CSP-FTP server in the "temp" directory.
Note: Exchange is definitely not up because the GC (global catalog) is CSP-AD - the role must be transferred/duplicated to another server, which in this case will be CSP-BACKUP. KB272552
NOTE:::: MAY NEED TO SEIZE ROLES EVENTUALLY FROM CSP-AD AND PUT THEM ONTO A FUNCTIONAL DOMAIN CONTROLLER ----- CSP-BACKUP. THIS IS NOT A PREFERRED METHOD, NEED TO EXHAUST ALL CAPABILITIES BEFORE SEIZING ROLES BECAUSE YOU CAN NOT ADD THE CSP-AD SERVER BACK IN AS A DOMAIN CONTROLLER AFTER DOING SO!!!!!!
Booted CSP-AD into DSR mode - restored the system state from the Backup Exec backup dated 11/7/08 ---- lsass.exe error still there after successful restore.
Attempted DCPROMO to gracefully demote the CSP-AD server and transfer all 5 FSMO roles to CSP-BACKUP (backup domain controller), but you cannot run DCPROMO from DSR. Attempted to transfer the roles from CSP-BACKUP, but it cannot contact CSP-AD since it is in DSR mode.

BEGINNING STEPS TO SEIZE THE ROLES FROM CSP-AD
1. Active Directory got corrupted.
2. Could not boot in normal mode and as a result could not demote the DC gracefully or forcefully, so demoted it by changing the key HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\ProductOptions to ServerNT to trick it into thinking it was a workgroup server.
3. Changed the membership to workgroup and rebooted.
4. Promoted the workgroup server to Microsoft.local by pointing its DNS to itself, then demoted it gracefully.
5. Did a metadata cleanup of the DC CSP-AD from Active Directory, hacking up the schema using ADSI Edit - removed all
6. Deleted the records for CSP-AD from DNS.
7. Seized the roles on the CSP-BACKUP DC.
8. Promoted CSP-BACKUP to Global Catalog server.
9. Ran dcdiag on the CSP-BACKUP server and found that it failed the DNS connectivity test and could not find an authoritative time server.
10. Found that DNS was trying to load duplicate zones "cspinc.local" from DomainDNSZones and ForestDNSZones.
11. Deleted the zones from DomainDNSZones and ForestDNSZones using ADSI Edit.
12. Made CSP-BACKUP the authoritative time server by changing the value to "NTP" from "NT5DS" and changing the value of announce flags to decimal "5".
13. Restarted the Windows Time service and ran the command "w32tm /resync /rediscover" to synchronize time successfully from "time.windows.com".
14. Ran dcdiag and found it to be clean.
15. Promoted the CSP-AD workgroup machine to DC by pointing its DNS to the CSP-BACKUP server.
16. Machine promoted successfully; checked replication and it is fine.
17. Restarted Exchange services and found them to be working fine.

*****SUNDAY 11/9/08 ********
Traveled to CSP @ 9:00am to begin troubleshooting why the PIX is not transferring DNS traffic. ------- Reverted the Cisco PIX settings now that CSP-AD is up and running DNS/DHCP services. Confirmed operational.
Made CSP-BACKUP a preferred time server through the registry.
Using ADSI Edit, fixed DNS & AD replication problems we had already been having before the lsass.exe problem.
CSP-AD - set the new local administrator password to match the one in the documentation.
Troubleshot why VOICEMAIL at CSP-UNITY is not working - need to make CSP-AD a GC again. After doing so, the webmail / mobile phone access came online. Rebooted Unity and now it is transferring voicemails properly. However, voicemails from this weekend did not transfer properly and they are lost because Exchange could not find a GC.
Faxing is not working - restarted services and tested 919-424-2070; now it is working properly. Fixed at 1:35pm.
CSP-WEB is having event log errors, event ID 9153. Corrected by putting Exchange into native mode per KB328931.
After doing so, waited 13 minutes; no longer having any issues with this event log entry - resolved.
Remotely set D: to active on CSP-AD since it was only showing up as unallocated space. Data came back but is showing as a "RAW" partition instead of NTFS in drive properties. Showing as NTFS in Disk Management. Rebooted CSP-AD and performed a checkdisk. Shares came back as intended.
DHCP and DNS are functioning well - needed to transfer the zone cspinc.com to the CSP-AD server because it was not set to transfer from CSP-BACKUP. Copied the zone; all working fine now when making entries on both servers.
Tested Tigerpaw, webmail, phone access, faxing, voicemail and inter-office mail. All is working correctly.
Putting this service order in HOLD STATUS until Monday morning. Going home.
*note* - will need to recreate WMI for LPI GPO - it was corrupt and needed to be deleted.

**** Monday 11/10/08 *****
Came in to find that RA has restarted some services and IIS on the CSP-EXCH server; email is being spooled in Postini and phones are not working.
Verified event logs - Exchange has stopped responding. Rebooted the web server - Exchange still not working - checked logs; CSP-EXCH needs a restart.
Restarted CSP-EXCH - came back up, Postini stopped spooling mail and all email is flowing again.
HOLD>
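
*note* - for future reference, the time server change from Saturday (steps 12 and 13) can be sketched as a short script. This is only a sketch under assumptions: the log does not record which registry keys were edited, so the W32Time paths and value names below are assumed to be the standard ones, and the function name time_server_commands is made up for illustration. The script only builds and prints the command lines; it does not run anything.

```python
# Sketch only: builds the Windows command lines for steps 12-13 of the
# Saturday recovery list. Nothing here is executed against a server.
# ASSUMPTION: the registry path and value names are the standard W32Time
# locations; the log itself only says "NTP", "NT5DS" and announce flags "5".
W32TIME = r"HKLM\SYSTEM\CurrentControlSet\Services\W32Time"


def time_server_commands():
    """Return, in order, the commands to make a DC an authoritative time server."""
    return [
        # Step 12: change the sync type from NT5DS (domain hierarchy) to NTP.
        f'reg add "{W32TIME}\\Parameters" /v Type /t REG_SZ /d NTP /f',
        # Step 12: set the announce flags to decimal 5.
        f'reg add "{W32TIME}\\Config" /v AnnounceFlags /t REG_DWORD /d 5 /f',
        # Step 13: restart the Windows Time service.
        "net stop w32time",
        "net start w32time",
        # Step 13: resync, as recorded in the log.
        "w32tm /resync /rediscover",
    ]


if __name__ == "__main__":
    for cmd in time_server_commands():
        print(cmd)
```

Review the printed commands before running them by hand on the DC, then re-run dcdiag as in step 14 to confirm the time server test is clean.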