Hi All,
I'm designing a LiFePO4 battery charger/UPS for a client. The client supplies the battery, which is equipped with a BMS (not of our design) based on a BQ3060. During charge and discharge we communicate with the BMS at a 1 second interval to obtain its capacity, voltage, current, status, etc. On the desk everything seems to work fine, but we're getting complaints of defective units.
A closer look revealed errors in the SMBus/I2C communication. The BMS ACKs its own address, but NACKs the command byte following it (for example 0x09 to read the voltage). A plain read without a command returns 0x17. The problem is not related to our hardware: disconnecting the battery and attaching it to another unit yields identical behavior. Shorting REG27 to ground makes the BMS resume normal operation. The strange thing is that all the primary functions of the BMS still work fine in this condition: I could short the battery without it blowing up, and the BMS disconnected the battery to protect it from deep discharge.
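To make the failure signature easier to log in the field, the distinction between "address NACKed" and "address ACKed, command NACKed" can be recorded per transaction. Below is a minimal host-side sketch in Python of that classification. The names `Result` and `classify_read_word` are made up for illustration, and it assumes the bus driver can report the two ACK bits separately, which not every I2C peripheral does. The word decoding follows the SMBus Read Word convention (low byte first), with Voltage() at command 0x09 returning millivolts.

```python
from enum import Enum


class Result(Enum):
    OK = "ok"
    NO_DEVICE = "address NACKed"
    COMMAND_NACK = "address ACKed, command NACKed"
    DATA_ERROR = "short read"


SBS_VOLTAGE = 0x09  # SBS Voltage() command, result in mV


def classify_read_word(addr_ack, cmd_ack, data):
    """Classify one SMBus Read Word attempt from the observed ACK bits.

    addr_ack / cmd_ack: True if the BMS ACKed the address / command byte.
    data: the two data bytes received (low byte first), or None.
    Returns (Result, value), where value is None on any failure.
    """
    if not addr_ack:
        # Nothing answered at all: battery absent or bus disconnected.
        return Result.NO_DEVICE, None
    if not cmd_ack:
        # The condition described above: the gauge is present but
        # refuses the command byte.
        return Result.COMMAND_NACK, None
    if data is None or len(data) != 2:
        return Result.DATA_ERROR, None
    # SMBus words are transmitted low byte first.
    return Result.OK, data[0] | (data[1] << 8)
```

Logging `Result.COMMAND_NACK` separately from `Result.NO_DEVICE` would at least show whether returned units all fail in this same way or whether other bus errors are mixed in.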
Some scope images to illustrate the problem:
Here everything works: reading the voltage (this is after the hard reset by shorting REG27).
Here the BQ3060 ACKs its address, but NACKs the command following it.
The strange "hops" in the voltage are caused by the level shifter circuit I had to implement (the signal is sampled at SCL/SDA on our board, not at the BQ3060 side). The designers of the BMS disconnect the GND of the battery using N-channel FETs, as opposed to the P-channel ones suggested by the application note. This can potentially drive the voltage on SDA/SCL negative when the BMS disconnects.
Diagram of the level shifter:
SDA/SCL are connected to the main processor on our board; connector P4 connects to the BMS. The ground of the BMS is not wired separately, because that would bypass the N-channel FETs. Unfortunately we don't possess the schematic of the BMS, and the supplier is reluctant to hand it over. (We might eventually design the BMS ourselves, but we need to get this resolved first.)
Once the BMS has been reset, it continues to work fine. Some units have even been working for weeks now without any issue.
The problem is that I cannot reproduce it; even moderate ESD injection doesn't lead to the above behavior. I can't find any errata for the BQ3060. I would gladly write a software workaround, but we're even considering hacking in a relay between the REG27 pin and ground, so that we can reset the BQ3060 ourselves when it does not respond... Does anyone have any idea what could possibly trigger this condition?
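For the relay idea, the decision logic could be as simple as counting consecutive failed polls. The following is a rough Python sketch of that logic only, not firmware. `BmsWatchdog`, `max_failures` and `holdoff_polls` are hypothetical names and values, and it assumes that pulsing REG27 to ground is an acceptable way to reset the gauge, which I have only observed empirically and not confirmed from the datasheet.

```python
class BmsWatchdog:
    """Decide when to pulse the (hypothetical) REG27 reset relay.

    One call to record() per poll, i.e. once per second here.
    """

    def __init__(self, max_failures=5, holdoff_polls=30):
        self.max_failures = max_failures    # consecutive failures before reset
        self.holdoff_polls = holdoff_polls  # polls ignored after a reset
        self.failures = 0
        self.holdoff = 0
        self.resets = 0                     # total resets, worth logging

    def record(self, ok):
        """Feed one poll result; return True if the relay should be pulsed."""
        if self.holdoff > 0:
            # Give the gauge time to restart before judging it again.
            self.holdoff -= 1
            return False
        if ok:
            self.failures = 0
            return False
        self.failures += 1
        if self.failures >= self.max_failures:
            self.failures = 0
            self.holdoff = self.holdoff_polls
            self.resets += 1
            return True
        return False
```

The hold-off is there so that a battery that stays unresponsive does not get its REG27 pin shorted every few seconds, and the `resets` counter gives a number to report back from units in the field.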
Regards,
Dimitri Princen