Commit Graph

2442 Commits

Author SHA1 Message Date
Mathias Krause
734c2363eb scsi: lpfc: Properly set WC for DPP mapping
[ Upstream commit bffda93a51 ]

Using set_memory_wc() to enable write-combining for the DPP portion of
the MMIO mapping is wrong as set_memory_*() is meant to operate on RAM
only, not MMIO mappings. In fact, as used currently triggers a BUG_ON()
with enabled CONFIG_DEBUG_VIRTUAL.

Simply map the DPP region separately and in addition to the already
existing mappings, avoiding any possible negative side effects for
these.

Fixes: 1351e69fc6 ("scsi: lpfc: Add push-to-adapter support to sli4")
Signed-off-by: Mathias Krause <minipli@grsecurity.net>
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Reviewed-by: Mathias Krause <minipli@grsecurity.net>
Link: https://patch.msgid.link/20260212192327.141104-1-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2026-03-13 17:20:16 +01:00
Justin Tee
4ae7e2d72d scsi: lpfc: Ensure PLOGI_ACC is sent prior to PRLI in Point to Point topology
[ Upstream commit 2bf81856a4 ]

There is a timing race condition when a PRLI may be sent on the wire
before PLOGI_ACC in Point to Point topology.  Fix by deferring REG_RPI
mbox completion handling to after PLOGI_ACC's CQE completion.  Because
the discovery state machine only sends PRLI after REG_RPI mbox
completion, PRLI is now guaranteed to be sent after PLOGI_ACC.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Message-ID: <20250915180811.137530-8-justintee8345@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-11-13 15:34:26 -05:00
Justin Tee
fdf019f2a3 scsi: lpfc: Define size of debugfs entry for xri rebalancing
[ Upstream commit 5de09770b1 ]

To assist in debugging lpfc_xri_rebalancing driver parameter, a debugfs
entry is used.  The debugfs file operations for xri rebalancing have
been previously implemented, but lack definition for its information
buffer size.  Similar to other pre-existing debugfs entry buffers,
define LPFC_HDWQINFO_SIZE as 8192 bytes.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Message-ID: <20250915180811.137530-9-justintee8345@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-11-13 15:34:26 -05:00
Justin Tee
78273bfb21 scsi: lpfc: Remove ndlp kref decrement clause for F_Port_Ctrl in lpfc_cleanup
[ Upstream commit a4809b98eb ]

In lpfc_cleanup, there is an extraneous nlp_put for NPIV ports on the
F_Port_Ctrl ndlp object.  In cases when an ABTS is issued, the
outstanding kref is needed for when a second XRI_ABORTED CQE is
received.  The final kref for the ndlp is designed to be decremented in
lpfc_sli4_els_xri_aborted instead.  Also, add a new log message to allow
for future diagnostics when debugging related issues.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Message-ID: <20250915180811.137530-5-justintee8345@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-11-13 15:34:26 -05:00
Justin Tee
dd475ead4b scsi: lpfc: Check return status of lpfc_reset_flush_io_context during TGT_RESET
[ Upstream commit f408dde246 ]

If lpfc_reset_flush_io_context fails to execute, then the wrong return
status code may be passed back to upper layers when issuing a target
reset TMF command.  Fix by checking the return status from
lpfc_reset_flush_io_context() first in order to properly return FAILED
or FAST_IO_FAIL.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Message-ID: <20250915180811.137530-7-justintee8345@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-11-13 15:34:26 -05:00
Justin Tee
90b0209572 scsi: lpfc: Decrement ndlp kref after FDISC retries exhausted
[ Upstream commit b5bf6d681f ]

The kref for Fabric_DID ndlps is not decremented after repeated FDISC
failures and exhausting maximum allowed retries.  This can leave the
ndlp lingering unnecessarily.  Add a test and set bit operation for the
NLP_DROPPED flag. If not previously set, then a kref is decremented. The
ndlp is freed when the remaining reference for the completing ELS is
put.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Message-ID: <20250915180811.137530-6-justintee8345@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-11-13 15:34:25 -05:00
Justin Tee
234cb3ca07 scsi: lpfc: Clean up allocated queues when queue setup mbox commands fail
[ Upstream commit 803dfd83df ]

lpfc_sli4_queue_setup() does not allocate memory and is used for
submitting CREATE_QUEUE mailbox commands.  Thus, if such mailbox
commands fail we should clean up by also freeing the memory allocated
for the queues with lpfc_sli4_queue_destroy().  Change the intended
clean up label for the lpfc_sli4_queue_setup() error case to
out_destroy_queue.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Message-ID: <20250915180811.137530-4-justintee8345@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-11-13 15:34:25 -05:00
John Evans
367cb5ffd8 scsi: lpfc: Fix buffer free/clear order in deferred receive path
commit 9dba9a45c3 upstream.

Fix a use-after-free window by correcting the buffer release sequence in
the deferred receive path. The code freed the RQ buffer first and only
then cleared the context pointer under the lock. Concurrent paths (e.g.,
ABTS and the repost path) also inspect and release the same pointer under
the lock, so the old order could lead to double-free/UAF.

Note that the repost path already uses the correct pattern: detach the
pointer under the lock, then free it after dropping the lock. The
deferred path should do the same.

Fixes: 472e146d1c ("scsi: lpfc: Correct upcalling nvmet_fc transport during io done downcall")
Cc: stable@vger.kernel.org
Signed-off-by: John Evans <evans1210144@gmail.com>
Link: https://lore.kernel.org/r/20250828044008.743-1-evans1210144@gmail.com
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-09-09 18:58:18 +02:00
Jiasheng Jiang
7fdc6efef6 scsi: lpfc: Remove redundant assignment to avoid memory leak
[ Upstream commit eea6cafb58 ]

Remove the redundant assignment if kzalloc() succeeds to avoid memory
leak.

Fixes: bd2cdd5e40 ("scsi: lpfc: NVME Initiator: Add debugfs support")
Signed-off-by: Jiasheng Jiang <jiashengjiangcool@gmail.com>
Link: https://lore.kernel.org/r/20250801185202.42631-1-jiashengjiangcool@gmail.com
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-20 18:30:50 +02:00
Justin Tee
571617f171 scsi: lpfc: Check for hdwq null ptr when cleaning up lpfc_vport structure
[ Upstream commit 6698796282 ]

If a call to lpfc_sli4_read_rev() from lpfc_sli4_hba_setup() fails, the
resultant cleanup routine lpfc_sli4_vport_delete_fcp_xri_aborted() may
occur before sli4_hba.hdwqs are allocated.  This may result in a null
pointer dereference when attempting to take the abts_io_buf_list_lock for
the first hardware queue.  Fix by adding a null ptr check on
phba->sli4_hba.hdwq and early return because this situation means there
must have been an error during port initialization.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20250618192138.124116-4-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-20 18:30:44 +02:00
Justin Tee
64d853788f scsi: lpfc: Ensure HBA_SETUP flag is used only for SLI4 in dev_loss_tmo_callbk
[ Upstream commit 1cced5779e ]

For SLI3, the HBA_SETUP flag is never set so the lpfc_dev_loss_tmo_callbk
always early returns.  Add a phba->sli_rev check for SLI4 mode so that
the SLI3 path can flow through the original dev_loss_tmo worker thread
design to lpfc_dev_loss_tmo_handler instead of early return.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20250618192138.124116-9-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-08-20 18:30:43 +02:00
Ewan D. Milne
c0527f7534 scsi: lpfc: Restore clearing of NLP_UNREG_INP in ndlp->nlp_flag
[ Upstream commit 040492ac25 ]

Commit 32566a6f1a ("scsi: lpfc: Remove NLP_RELEASE_RPI flag from nodelist
structure") introduced a regression with SLI-3 adapters (e.g. LPe12000 8Gb)
where a Link Down / Link Up such as caused by disabling an host FC switch
port would result in the devices remaining in the transport-offline state
and multipath reporting them as failed.  This problem was not seen with
newer SLI-4 adapters.

The problem was caused by portions of the patch which removed the functions
__lpfc_sli_rpi_release() and lpfc_sli_rpi_release() and all their callers.
This was presumably because with the removal of the NLP_RELEASE_RPI flag
there was no need to free the rpi.

However, __lpfc_sli_rpi_release() and lpfc_sli_rpi_release() which calls it
reset the NLP_UNREG_INP flag. And, lpfc_sli_def_mbox_cmpl() has a path
where __lpfc_sli_rpi_release() was called in a particular case where
NLP_UNREG_INP was not otherwise cleared because of other conditions.

Restoring the else clause of this conditional and simply clearing the
NLP_UNREG_INP flag appears to resolve the problem with SLI-3 adapters.  It
should be noted that the code path in question is not specific to SLI-3,
but there are other SLI-4 code paths which may have masked the issue.

Fixes: 32566a6f1a ("scsi: lpfc: Remove NLP_RELEASE_RPI flag from nodelist structure")
Cc: stable@vger.kernel.org
Tested-by: Marco Patalano <mpatalan@redhat.com>
Signed-off-by: Ewan D. Milne <emilne@redhat.com>
Link: https://lore.kernel.org/r/20250317163731.356873-1-emilne@redhat.com
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-07-10 16:05:05 +02:00
Justin Tee
ea405fb414 scsi: lpfc: Avoid potential ndlp use-after-free in dev_loss_tmo_callbk
[ Upstream commit b5162bb6aa ]

Smatch detected a potential use-after-free of an ndlp oject in
dev_loss_tmo_callbk during driver unload or fatal error handling.

Fix by reordering code to avoid potential use-after-free if initial
nodelist reference has been previously removed.

Fixes: 4281f44ea8 ("scsi: lpfc: Prevent NDLP reference count underflow in dev_loss_tmo callback")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/linux-scsi/41c1d855-9eb5-416f-ac12-8b61929201a3@stanley.mountain/
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20250425194806.3585-6-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-07-10 16:05:00 +02:00
Justin Tee
6857cbf0e4 scsi: lpfc: Change lpfc_nodelist nlp_flag member into a bitmask
[ Upstream commit 92b99f1a73 ]

In attempt to reduce the amount of unnecessary ndlp->lock acquisitions
in the lpfc driver, change nlpa_flag into an unsigned long bitmask and
use clear_bit/test_bit bitwise atomic APIs instead of reliance on
ndlp->lock for synchronization.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20241031223219.152342-10-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Stable-dep-of: b5162bb6aa ("scsi: lpfc: Avoid potential ndlp use-after-free in dev_loss_tmo_callbk")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-07-10 16:04:59 +02:00
Justin Tee
ae082dbcef scsi: lpfc: Remove NLP_RELEASE_RPI flag from nodelist structure
[ Upstream commit 32566a6f1a ]

An RPI is tightly bound to an NDLP structure and is freed only upon
release of an NDLP object.  As such, there should be no logic that frees
an RPI outside of the lpfc_nlp_release() routine.  In order to reinforce
the original design usage of RPIs, remove the NLP_RELEASE_RPI flag and
related logic.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20241031223219.152342-9-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Stable-dep-of: b5162bb6aa ("scsi: lpfc: Avoid potential ndlp use-after-free in dev_loss_tmo_callbk")
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-07-10 16:04:59 +02:00
Daniel Wagner
2f63bf0d2b scsi: lpfc: Use memcpy() for BIOS version
[ Upstream commit ae82eaf4ae ]

The strlcat() with FORTIFY support is triggering a panic because it
thinks the target buffer will overflow although the correct target
buffer size is passed in.

Anyway, instead of memset() with 0 followed by a strlcat(), just use
memcpy() and ensure that the resulting buffer is NULL terminated.

BIOSVersion is only used for the lpfc_printf_log() which expects a
properly terminated string.

Signed-off-by: Daniel Wagner <wagi@kernel.org>
Link: https://lore.kernel.org/r/20250409-fix-lpfc-bios-str-v1-1-05dac9e51e13@kernel.org
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-06-27 11:11:34 +01:00
Justin Tee
ab3f6cf370 scsi: lpfc: Fix lpfc_check_sli_ndlp() handling for GEN_REQUEST64 commands
[ Upstream commit 05ae6c9c73 ]

In lpfc_check_sli_ndlp(), the get_job_els_rsp64_did remote_id assignment
does not apply for GEN_REQUEST64 commands as it only has meaning for a
ELS_REQUEST64 command.  So, if (iocb->ndlp == ndlp) is false, we could
erroneously return the wrong value.  Fix by replacing the fallthrough
statement with a break statement before the remote_id check.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20250425194806.3585-2-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-06-27 11:11:32 +01:00
Justin Tee
1be28b37a6 scsi: lpfc: Free phba irq in lpfc_sli4_enable_msi() when pci_irq_vector() fails
[ Upstream commit f0842902b3 ]

Fix smatch warning regarding missed calls to free_irq().  Free the phba IRQ
in the failed pci_irq_vector cases.

lpfc_init.c: lpfc_sli4_enable_msi() warn: 'phba->pcidev->irq' from
             request_irq() not released.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20250131000524.163662-3-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 11:03:03 +02:00
Justin Tee
609bc6e9c1 scsi: lpfc: Ignore ndlp rport mismatch in dev_loss_tmo callbk
[ Upstream commit 23ed628977 ]

With repeated port swaps between separate fabrics, there can be multiple
registrations for fabric well known address 0xfffffe.  This can cause ndlp
reference confusion due to the usage of a single ndlp ptr that stores the
rport object in fc_rport struct private storage during transport
registration.  Subsequent registrations update the ndlp->rport field with
the newer rport, so when transport layer triggers dev_loss_tmo for the
earlier registered rport the ndlp->rport private storage is referencing the
newer rport instead of the older rport in dev_loss_tmo callbk.

Because the older ndlp->rport object is already cleaned up elsewhere in
driver code during the time of fabric swap, check that the rport provided
in dev_loss_tmo callbk actually matches the rport stored in the LLDD's
ndlp->rport field.  Otherwise, skip dev_loss_tmo work on a stale rport.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20250131000524.163662-4-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 11:03:03 +02:00
Justin Tee
c670902775 scsi: lpfc: Handle duplicate D_IDs in ndlp search-by D_ID routine
[ Upstream commit 56c3d809b7 ]

After a port swap between separate fabrics, there may be multiple nodes in
the vport's fc_nodes list with the same fabric well known address.
Duplication is temporary and eventually resolves itself after dev_loss_tmo
expires, but nameserver queries may still occur before dev_loss_tmo.  This
possibly results in returning stale fabric ndlp objects.  Fix by adding an
nlp_state check to ensure the ndlp search routine returns the correct newer
allocated ndlp fabric object.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20250131000524.163662-5-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 11:03:03 +02:00
Justin Tee
e4913d4bc5 scsi: lpfc: Prevent NDLP reference count underflow in dev_loss_tmo callback
[ Upstream commit 4281f44ea8 ]

Current dev_loss_tmo handling checks whether there has been a previous
call to unregister with SCSI transport.  If so, the NDLP kref count is
decremented a second time in dev_loss_tmo as the final kref release.
However, this can sometimes result in a reference count underflow if
there is also a race to unregister with NVMe transport as well.  Add a
check for NVMe transport registration before decrementing the final
kref.  If NVMe transport is still registered, then the NVMe transport
unregistration is designated as the final kref decrement.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20241031223219.152342-8-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-14 20:04:00 +01:00
Justin Tee
32a2d38782 scsi: lpfc: Check SLI_ACTIVE flag in FDMI cmpl before submitting follow up FDMI
[ Upstream commit 98f8d35880 ]

The lpfc_cmpl_ct_disc_fdmi() routine has incorrect logic that treats an
FDMI completion with error LOCAL_REJECT/SLI_ABORTED as a success status.
Under the erroneous assumption of successful completion, the routine
proceeds to issue follow up FDMI commands, which may never complete if
the HBA is in an errata state as indicated by the errored completion
status.  Fix by freeing FDMI cmd resources and early return when the
LPFC_SLI_ACTIVE flag is not set and a LOCAL_REJECT/SLI_ABORTED or
SLI_DOWN status is received.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20241031223219.152342-6-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-14 20:04:00 +01:00
Justin Tee
78ef7c3909 scsi: lpfc: Call lpfc_sli4_queue_unset() in restart and rmmod paths
[ Upstream commit d35f767271 ]

During initialization, the driver allocates wq->pring in lpfc_wq_create
and lpfc_sli4_queue_unset() is the only place where kfree(wq->pring) is
called.

There is a possible memory leak in lpfc_sli_brdrestart_s4() (restart)
and lpfc_pci_remove_one_s4() (rmmod) paths because there are no calls to
lpfc_sli4_queue_unset() to kfree() the wq->pring.

Fix by inserting a call to lpfc_sli4_queue_unset() in
lpfc_sli_brdrestart_s4() and lpfc_sli4_hba_unset() routines.  Also, add
a check for the SLI_ACTIVE flag before issuing the Q_DESTROY mailbox
command.  If not set, then the mailbox command will obviously fail.  In
such cases, skip issuing the mailbox command and only execute the driver
resource clean up portions of the lpfc_*q_destroy routines.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20241031223219.152342-4-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-14 20:04:00 +01:00
Al Viro
5f60d5f6bb move asm/unaligned.h to linux/unaligned.h
asm/unaligned.h is always an include of asm-generic/unaligned.h;
might as well move that thing to linux/unaligned.h and include
that - there's nothing arch-specific in that header.

auto-generated by the following:

for i in `git grep -l -w asm/unaligned.h`; do
	sed -i -e "s/asm\/unaligned.h/linux\/unaligned.h/" $i
done
for i in `git grep -l -w asm-generic/unaligned.h`; do
	sed -i -e "s/asm-generic\/unaligned.h/linux\/unaligned.h/" $i
done
git mv include/asm-generic/unaligned.h include/linux/unaligned.h
git mv tools/include/asm-generic/unaligned.h tools/include/linux/unaligned.h
sed -i -e "/unaligned.h/d" include/asm-generic/Kbuild
sed -i -e "s/__ASM_GENERIC/__LINUX/" include/linux/unaligned.h tools/include/linux/unaligned.h
2024-10-02 17:23:23 -04:00
Linus Torvalds
3ed7df0852 Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull more SCSI updates from James Bottomley:
 "These are mostly minor updates.

  There are two drivers (lpfc and mpi3mr) which missed the initial
  pull and a core change to retry a start/stop unit which affect
  suspend/resume"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (32 commits)
  scsi: lpfc: Update lpfc version to 14.4.0.5
  scsi: lpfc: Support loopback tests with VMID enabled
  scsi: lpfc: Revise TRACE_EVENT log flag severities from KERN_ERR to KERN_WARNING
  scsi: lpfc: Ensure DA_ID handling completion before deleting an NPIV instance
  scsi: lpfc: Fix kref imbalance on fabric ndlps from dev_loss_tmo handler
  scsi: lpfc: Restrict support for 32 byte CDBs to specific HBAs
  scsi: lpfc: Update phba link state conditional before sending CMF_SYNC_WQE
  scsi: lpfc: Add ELS_RSP cmd to the list of WQEs to flush in lpfc_els_flush_cmd()
  scsi: mpi3mr: Update driver version to 8.12.0.0.50
  scsi: mpi3mr: Improve wait logic while controller transitions to READY state
  scsi: mpi3mr: Update MPI Headers to revision 34
  scsi: mpi3mr: Use firmware-provided timestamp update interval
  scsi: mpi3mr: Enhance the Enable Controller retry logic
  scsi: sd: Fix off-by-one error in sd_read_block_characteristics()
  scsi: pm8001: Do not overwrite PCI queue mapping
  scsi: scsi_debug: Remove a useless memset()
  scsi: pmcraid: Convert comma to semicolon
  scsi: sd: Retry START STOP UNIT commands
  scsi: mpi3mr: A performance fix
  scsi: ufs: qcom: Update MODE_MAX cfg_bw value
  ...
2024-09-29 09:22:34 -07:00
Linus Torvalds
a1d1eb2f57 Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
 "Updates to the usual drivers (ufs, smartpqi, NCR5380, mac_scsi, lpfc,
  mpi3mr).

  There are no user visible core changes and a whole series of minor
  updates and fixes. The largest core change is probably the
  simplification of the workqueue allocation path"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (86 commits)
  scsi: smartpqi: update driver version to 2.1.30-031
  scsi: smartpqi: fix volume size updates
  scsi: smartpqi: fix rare system hang during LUN reset
  scsi: smartpqi: add new controller PCI IDs
  scsi: smartpqi: add counter for parity write stream requests
  scsi: smartpqi: correct stream detection
  scsi: smartpqi: Add fw log to kdump
  scsi: bnx2fc: Remove some unused fields in struct bnx2fc_rport
  scsi: qla2xxx: Remove the unused 'del_list_entry' field in struct fc_port
  scsi: ufs: core: Remove ufshcd_urgent_bkops()
  scsi: core: Remove obsoleted declaration for scsi_driverbyte_string()
  scsi: bnx2i: Remove unused declarations
  scsi: core: Simplify an alloc_workqueue() invocation
  scsi: ufs: Simplify alloc*_workqueue() invocation
  scsi: stex: Simplify an alloc_ordered_workqueue() invocation
  scsi: scsi_transport_fc: Simplify alloc_workqueue() invocations
  scsi: snic: Simplify alloc_workqueue() invocations
  scsi: qedi: Simplify an alloc_workqueue() invocation
  scsi: qedf: Simplify alloc_workqueue() invocations
  scsi: myrs: Simplify an alloc_ordered_workqueue() invocation
  ...
2024-09-19 11:28:51 +02:00
Linus Torvalds
726e2d0cf2 Merge tag 'dma-mapping-6.12-2024-09-19' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping updates from Christoph Hellwig:

 - support DMA zones for arm64 systems where memory starts at > 4GB
   (Baruch Siach, Catalin Marinas)

 - support direct calls into dma-iommu and thus obsolete dma_map_ops for
   many common configurations (Leon Romanovsky)

 - add DMA-API tracing (Sean Anderson)

 - remove the not very useful return value from various dma_set_* APIs
   (Christoph Hellwig)

 - misc cleanups and minor optimizations (Chen Y, Yosry Ahmed, Christoph
   Hellwig)

* tag 'dma-mapping-6.12-2024-09-19' of git://git.infradead.org/users/hch/dma-mapping:
  dma-mapping: reflow dma_supported
  dma-mapping: reliably inform about DMA support for IOMMU
  dma-mapping: add tracing for dma-mapping API calls
  dma-mapping: use IOMMU DMA calls for common alloc/free page calls
  dma-direct: optimize page freeing when it is not addressable
  dma-mapping: clearly mark DMA ops as an architecture feature
  vdpa_sim: don't select DMA_OPS
  arm64: mm: keep low RAM dma zone
  dma-mapping: don't return errors from dma_set_max_seg_size
  dma-mapping: don't return errors from dma_set_seg_boundary
  dma-mapping: don't return errors from dma_set_min_align_mask
  scsi: check that busses support the DMA API before setting dma parameters
  arm64: mm: fix DMA zone when dma-ranges is missing
  dma-mapping: direct calls for dma-iommu
  dma-mapping: call ->unmap_page and ->unmap_sg unconditionally
  arm64: support DMA zone above 4GB
  dma-mapping: replace zone_dma_bits by zone_dma_limit
  dma-mapping: use bit masking to check VM_DMA_COHERENT
2024-09-19 11:12:49 +02:00
Martin K. Petersen
359aeb8648 Merge patch series "Update lpfc to revision 14.4.0.5"
Justin Tee <justintee8345@gmail.com> says:

Update lpfc to revision 14.4.0.5

This patch set contains bug fixes related to HBA state clean ups, FCP
discovery on older adapters, kref imbalances, log message improvements,
and support for a new diagnostic loopback testing mode.

The patches were cut against Martin's 6.12/scsi-queue tree.

Link: https://lore.kernel.org/r/20240912232447.45607-1-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 21:22:25 -04:00
Justin Tee
b071c1a909 scsi: lpfc: Update lpfc version to 14.4.0.5
Update lpfc version to 14.4.0.5

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240912232447.45607-9-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 21:21:19 -04:00
Justin Tee
eeb85c658e scsi: lpfc: Support loopback tests with VMID enabled
The VMID feature adds an extra application services header to each frame.
As such, the loopback test path is updated to accommodate the extra
application header.

Changes include filling in APPID and WQES bit fields for XMIT_SEQUENCE64
commands, a special loopback source APPID for verifying received loopback
data matches what is sent, and increasing ELS WQ size to accommodate the
APPID field in loopback test mode.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240912232447.45607-8-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 21:21:19 -04:00
Justin Tee
1af9af1f8a scsi: lpfc: Revise TRACE_EVENT log flag severities from KERN_ERR to KERN_WARNING
Revise certain log messages marked as KERN_ERR LOG_TRACE_EVENT to
KERN_WARNING and use the lpfc_vlog_msg() macro to still log the event.  The
benefit is that events of interest are still logged and the entire trace
buffer is not dumped with extraneous logging information when using default
lpfc_log_verbose driver parameter settings.

Also, delete the keyword "fail" from such log messages as they aren't
really causes for concern.  The log messages are more for warnings to a SAN
admin about SAN activity.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240912232447.45607-7-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 21:21:19 -04:00
Justin Tee
0a3c84f716 scsi: lpfc: Ensure DA_ID handling completion before deleting an NPIV instance
Deleting an NPIV instance requires all fabric ndlps to be released before
an NPIV's resources can be torn down.  Failure to release fabric ndlps
beforehand opens kref imbalance race conditions.  Fix by forcing the DA_ID
to complete synchronously with usage of wait_queue.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240912232447.45607-6-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 21:21:19 -04:00
Justin Tee
d1a2ef63fc scsi: lpfc: Fix kref imbalance on fabric ndlps from dev_loss_tmo handler
With a FLOGI outstanding and loss of physical link connection to the fabric
for the duration of dev_loss_tmo, there is a fabric ndlp kref imbalance
that decrements the kref and sets the NLP_IN_RECOV_POST_DEV_LOSS flag at
the same time.

The issue is that when the FLOGI completion routine executes, the fabric
ndlp could already be freed because of the final kref put from the
dev_loss_tmo handler.  Fix by early returning before the ndlp kref put if
the ndlp is deemed a candidate for NLP_IN_RECOV_POST_DEV_LOSS in the FLOGI
completion routine.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240912232447.45607-5-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 21:21:19 -04:00
Justin Tee
05ab4e7846 scsi: lpfc: Restrict support for 32 byte CDBs to specific HBAs
An older generation of HBAs are failing FCP discovery due to usage of an
outdated field in FCP command WQEs.

Fix by checking the SLI Interface Type register for applicable support of
32 Byte CDB commands, and restore a setting for a WQE path using normal 16
byte CDBs.

Fixes: af20bb73ac ("scsi: lpfc: Add support for 32 byte CDBs")
Cc: stable@vger.kernel.org # v6.10+
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240912232447.45607-4-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 21:21:19 -04:00
Justin Tee
fc318cac66 scsi: lpfc: Update phba link state conditional before sending CMF_SYNC_WQE
It's possible for the driver to send a CMF_SYNC_WQE to nonresponsive
firmware during reset of the adapter.  The phba link_state conditional
check is currently a strict == LPFC_LINK_DOWN, which does not cover
initialization states before reaching the LPFC_LINK_UP state.

Update the phba->link_state conditional to < LPFC_LINK_UP so that all
initialization states are covered before allowing sending CMF_SYNC_WQE.

Update taking of the hbalock to be during this link_state check as well.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240912232447.45607-3-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 21:21:19 -04:00
Justin Tee
93bcc5f398 scsi: lpfc: Add ELS_RSP cmd to the list of WQEs to flush in lpfc_els_flush_cmd()
During HBA stress testing, a spam of received PLOGIs exposes a resource
recovery bug causing leakage of lpfc_sqlq entries from the global
phba->sli4_hba.lpfc_els_sgl_list.

The issue is in lpfc_els_flush_cmd(), where the driver attempts to recover
outstanding ELS sgls when walking the txcmplq.  Only CMD_ELS_REQUEST64_CRs
and CMD_GEN_REQUEST64_CRs are added to the abort and cancel lists.  A check
for CMD_XMIT_ELS_RSP64_WQE is missing in order to recover LS_ACC usages of
the phba->sli4_hba.lpfc_els_sgl_list too.

Fix by adding CMD_XMIT_ELS_RSP64_WQE as part of the txcmplq walk when
adding WQEs to the abort and cancel list in lpfc_els_flush_cmd().  Also,
update naming convention from CRs to WQEs.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240912232447.45607-2-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 21:21:18 -04:00
Colin Ian King
c7c846fa94 scsi: lpfc: Remove trailing space after \n newline
There is a extraneous space after a newline in two lpfc_printf_log()
messages. Remove the space.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://lore.kernel.org/r/20240902150042.311157-1-colin.i.king@gmail.com
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-09-12 20:29:49 -04:00
Christoph Hellwig
334304ac2b dma-mapping: don't return errors from dma_set_max_seg_size
A NULL dev->dma_parms indicates either a bus that is not DMA capable or
grave bug in the implementation of the bus code.

There isn't much the driver can do in terms of error handling for either
case, so just warn and continue as DMA operations will fail anyway.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # For MMC
2024-08-29 07:22:49 +03:00
Sherry Yang
3417c9574e scsi: lpfc: Fix overflow build issue
Build failed while enabling "CONFIG_GCOV_KERNEL=y" and
"CONFIG_GCOV_PROFILE_ALL=y" with following error:

BUILDSTDERR: drivers/scsi/lpfc/lpfc_bsg.c: In function 'lpfc_get_cgnbuf_info':
BUILDSTDERR: ./include/linux/fortify-string.h:114:33: error: '__builtin_memcpy' accessing 18446744073709551615 bytes at offsets 0 and 0 overlaps 9223372036854775807 bytes at offset -9223372036854775808 [-Werror=restrict]
BUILDSTDERR:   114 | #define __underlying_memcpy     __builtin_memcpy
BUILDSTDERR:       |                                 ^
BUILDSTDERR: ./include/linux/fortify-string.h:637:9: note: in expansion of macro '__underlying_memcpy'
BUILDSTDERR:   637 |         __underlying_##op(p, q, __fortify_size);                        \
BUILDSTDERR:       |         ^~~~~~~~~~~~~
BUILDSTDERR: ./include/linux/fortify-string.h:682:26: note: in expansion of macro '__fortify_memcpy_chk'
BUILDSTDERR:   682 | #define memcpy(p, q, s)  __fortify_memcpy_chk(p, q, s,                  \
BUILDSTDERR:       |                          ^~~~~~~~~~~~~~~~~~~~
BUILDSTDERR: drivers/scsi/lpfc/lpfc_bsg.c:5468:9: note: in expansion of macro 'memcpy'
BUILDSTDERR:  5468 |         memcpy(cgn_buff, cp, cinfosz);
BUILDSTDERR:       |         ^~~~~~

This happens from the commit 06bb7fc0fe ("kbuild: turn on -Wrestrict by
default"). Address this issue by using size_t type.

Signed-off-by: Sherry Yang <sherry.yang@oracle.com>
Link: https://lore.kernel.org/r/20240821065131.1180791-1-sherry.yang@oracle.com
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-22 20:57:41 -04:00
Justin Tee
5b247f0377 scsi: lpfc: Copyright updates for 14.4.0.4 patches
Update copyrights to 2024 for files modified in the 14.4.0.4 patch set.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240726231512.92867-9-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-02 21:48:06 -04:00
Justin Tee
62b52495e6 scsi: lpfc: Update lpfc version to 14.4.0.4
Update lpfc version to 14.4.0.4

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240726231512.92867-8-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-02 21:48:06 -04:00
Justin Tee
1f0f7679ad scsi: lpfc: Update PRLO handling in direct attached topology
A kref imbalance occurs when handling an unsolicited PRLO in direct
attached topology.

Rework PRLO rcv handling when in MAPPED state.  Save the state that we were
handling a PRLO by setting nlp_last_elscmd to ELS_CMD_PRLO.  Then in the
lpfc_cmpl_els_logo_acc() completion routine, manually restart discovery.
By issuing the PLOGI, which nlp_gets, before nlp_put at the end of the
lpfc_cmpl_els_logo_acc() routine, we are saving us from a final nlp_put.
And, we are still allowing the unreg_rpi to happen.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240726231512.92867-7-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-02 21:48:05 -04:00
Justin Tee
b5c18c9dd1 scsi: lpfc: Fix unsolicited FLOGI kref imbalance when in direct attached topology
In direct attached topology, certain target vendors that are quick to issue
FLOGI followed by a cable pull for more than dev_loss_tmo may result in a
kref imbalance for the remote port ndlp object.

Add an nlp_get when the defer_flogi_acc flag is set.  This is expected to
balance the nlp_put in the defer_flogi_acc clause in the
lpfc_issue_els_flogi() routine.  Because we need to retain the ndlp ptr,
reorganize all of the defer_flogi_acc information into one
lpfc_defer_flogi_acc struct.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240726231512.92867-6-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-02 21:48:05 -04:00
Justin Tee
3976beb1b4 scsi: lpfc: Fix unintentional double clearing of vmid_flag
The vport->vmid_flag is unintentionally cleared twice after an issue_lip
via the lpfc_reinit_vmid routine().

The first call to lpfc_reinit_vmid() is in lpfc_cmpl_els_flogi().  Then
lpfc_cmpl_els_flogi_fabric() calls lpfc_register_new_vport(), which calls
lpfc_cmpl_reg_new_vport() when the mbox command completes and calls
lpfc_reinit_vmid() a second time.

Fix by moving the vmid_flag clear outside of the lpfc_reinit_vmid() routine
so that vmid_flag is only cleared once upon FLOGI completion.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240726231512.92867-5-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-02 21:48:05 -04:00
Justin Tee
2be1d4f119 scsi: lpfc: Validate hdwq pointers before dereferencing in reset/errata paths
When the HBA is undergoing a reset or is handling an errata event, NULL ptr
dereference crashes may occur in routines such as
lpfc_sli_flush_io_rings(), lpfc_dev_loss_tmo_callbk(), or
lpfc_abort_handler().

Add NULL ptr checks before dereferencing hdwq pointers that may have been
freed due to operations colliding with a reset or errata event handler.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240726231512.92867-4-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-02 21:48:05 -04:00
Justin Tee
f1bfe32073 scsi: lpfc: Remove redundant vport assignment when building an abort request
The lpfc_sli_issue_abort_iotag() routine has a redundant assignment of
abtsiocbp->vport = vport;

The duplicate lines are from a previous refactoring, and this patch removes
the accidental redundancy.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240726231512.92867-3-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-02 21:48:05 -04:00
Justin Tee
5b8963c53d scsi: lpfc: Change diagnostic log flag during receipt of unknown ELS cmds
During diagnostics, it has been determined that the 0115 log message for
receipt of unknown ELS cmds does not benefit from trace buffer dumps.  The
trace buffer dump floods the console with unnecessary information, and the
singular LOG_ELS flag has proven more beneficial in debugging efforts when
dealing with unknown ELS cmds.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240726231512.92867-2-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-08-02 21:48:05 -04:00
Linus Torvalds
4305ca0087 Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
 "Updates to the usual drivers (ufs, lpfc, qla2xxx, mpi3mr) plus some
  misc small fixes.

  The only core changes are to both bsg and scsi to pass in the device
  instead of setting it afterwards as q->queuedata, so no functional
  change"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (69 commits)
  scsi: aha152x: Use DECLARE_COMPLETION_ONSTACK for non-constant completion
  scsi: qla2xxx: Convert comma to semicolon
  scsi: qla2xxx: Update version to 10.02.09.300-k
  scsi: qla2xxx: Use QP lock to search for bsg
  scsi: qla2xxx: Reduce fabric scan duplicate code
  scsi: qla2xxx: Fix optrom version displayed in FDMI
  scsi: qla2xxx: During vport delete send async logout explicitly
  scsi: qla2xxx: Complete command early within lock
  scsi: qla2xxx: Fix flash read failure
  scsi: qla2xxx: Return ENOBUFS if sg_cnt is more than one for ELS cmds
  scsi: qla2xxx: Fix for possible memory corruption
  scsi: qla2xxx: validate nvme_local_port correctly
  scsi: qla2xxx: Unable to act on RSCN for port online
  scsi: ufs: exynos: Add support for Flash Memory Protector (FMP)
  scsi: ufs: core: Add UFSHCD_QUIRK_KEYS_IN_PRDT
  scsi: ufs: core: Add fill_crypto_prdt variant op
  scsi: ufs: core: Add UFSHCD_QUIRK_BROKEN_CRYPTO_ENABLE
  scsi: ufs: core: fold ufshcd_clear_keyslot() into its caller
  scsi: ufs: core: Add UFSHCD_QUIRK_CUSTOM_CRYPTO_PROFILE
  scsi: ufs: mcq: Make .get_hba_mac() optional
  ...
2024-07-19 10:56:58 -07:00
Martin K. Petersen
34438552c9 Merge patch series "Update lpfc to revision 14.4.0.3"
Justin Tee <justintee8345@gmail.com> says:

Update lpfc to revision 14.4.0.3

This patch set contains bug fixes related to discovery, submission of
mailbox commands, and proper endianness conversions.

The patches were cut against Martin's 6.11/scsi-queue tree.

Link: https://lore.kernel.org/r/20240628172011.25921-1-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 23:25:40 -04:00
Justin Tee
41972df1a5 scsi: lpfc: Update lpfc version to 14.4.0.3
Update lpfc version to 14.4.0.3.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240628172011.25921-9-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 23:24:52 -04:00