Commit Graph

152 Commits

Author SHA1 Message Date
Bjorn Helgaas 39195990e4 PCI: Correct PCI_CAP_EXP_ENDPOINT_SIZEOF_V2 value
fb82437fdd ("PCI: Change capability register offsets to hex") incorrectly
converted the PCI_CAP_EXP_ENDPOINT_SIZEOF_V2 value from decimal 52 to hex
0x32:

  -#define PCI_CAP_EXP_ENDPOINT_SIZEOF_V2 52      /* v2 endpoints with link end here */
  +#define PCI_CAP_EXP_ENDPOINT_SIZEOF_V2 0x32    /* end of v2 EPs w/ link */

This broke PCI capabilities in a VMM because subsequent ones weren't
DWORD-aligned.

Change PCI_CAP_EXP_ENDPOINT_SIZEOF_V2 to the correct value of 0x34.

fb82437fdd was from Baruch Siach <baruch@tkos.co.il>, but this was not
Baruch's fault; it's a mistake I made when applying the patch.

Fixes: fb82437fdd ("PCI: Change capability register offsets to hex")
Reported-by: David Woodhouse <dwmw2@infradead.org>
Closes: https://lore.kernel.org/all/3ae392a0158e9d9ab09a1d42150429dd8ca42791.camel@infradead.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2026-02-27 10:24:25 -06:00
Linus Torvalds e812928be2 Merge tag 'cxl-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull CXL updates from Dave Jiang:

 - Introduce cxl_memdev_attach and pave way for soft reserved handling,
   type2 accelerator enabling, and LSA 2.0 enabling. All these series
   require the endpoint driver to settle before continuing the memdev
   driver probe.

 - Address CXL port error protocol handling and reporting.

   The large patch series was split into three parts. The first two
   parts are included here with the final part coming later.

   The first part consists of a series of code refactoring to PCI AER
   sub-system that addresses CXL and also CXL RAS code to prepare for
   port error handling.

   The second part refactors the CXL code to move management of
   component registers to cxl_port objects to allow all CXL AER errors
   to be handled through the cxl_port hierarchy.

 - Provide AMD Zen5 platform address translation for CXL using ACPI
   PRMT. This includes a conventions document to explain why this is
   needed and how it's implemented.

 - Misc CXL patches of fixes, cleanups, and updates. Including CXL
   address translation for unaligned MOD3 regions.

[ TLA service: CXL is "Compute Express Link" ]

* tag 'cxl-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (59 commits)
  cxl: Disable HPA/SPA translation handlers for Normalized Addressing
  cxl/region: Factor out code into cxl_region_setup_poison()
  cxl/atl: Lock decoders that need address translation
  cxl: Enable AMD Zen5 address translation using ACPI PRMT
  cxl/acpi: Prepare use of EFI runtime services
  cxl: Introduce callback for HPA address ranges translation
  cxl/region: Use region data to get the root decoder
  cxl/region: Add @hpa_range argument to function cxl_calc_interleave_pos()
  cxl/region: Separate region parameter setup and region construction
  cxl: Simplify cxl_root_ops allocation and handling
  cxl/region: Store HPA range in struct cxl_region
  cxl/region: Store root decoder in struct cxl_region
  cxl/region: Rename misleading variable name @hpa to @hpa_range
  Documentation/driver-api/cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement
  cxl, doc: Moving conventions in separate files
  cxl, doc: Remove isonum.txt inclusion
  cxl/port: Unify endpoint and switch port lookup
  cxl/port: Move endpoint component register management to cxl_port
  cxl/port: Map Port RAS registers
  cxl/port: Move dport RAS setup to dport add time
  ...
2026-02-12 16:33:05 -08:00
Ilpo Järvinen cad3337bb6 PCI: Add dword #defines for Bus Number + Secondary Latency Timer
uapi/linux/pci_regs.h defines Primary/Secondary/Subordinate Bus Numbers
and Secondary Latency Timer (PCIe r7.0, sec. 7.5.1.3) as byte register
offsets, but in practice the code may read/write the entire dword. In the
lack of #defines to handle the dword fields, the code ends up using
literals which are not as easy to read.

Add dword field masks for the Bus Number and Secondary Latency Timer
fields and use them in probe.c.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
[bhelgaas: squash new #defines and uses together]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20251219174036.16738-21-ilpo.jarvinen@linux.intel.com
Link: https://patch.msgid.link/20251219174036.16738-22-ilpo.jarvinen@linux.intel.com
2026-01-27 16:36:53 -06:00
Terry Bowman 7c29ba0221 PCI: Introduce pcie_is_cxl()
CXL is a protocol that runs on top of PCIe electricals. Its error model
also runs on top of the PCIe AER error model by standardizing "internal"
errors as "CXL" errors. Linux has historically ignored internal errors.

CXL protocol error handling is then a task of enhancing the PCIe AER
core to understand that PCIe ports (upstream and downstream) and
endpoints may throw internal errors that represent standard CXL protocol
errors.

The proposed method to make that determination is to teach 'struct
pci_dev' to cache when its link has trained the CXL.mem and/or CXL.cache
protocols and then treat all internal errors as CXL errors. A design
goal is to not burden the PCIe AER core with CXL knowledge beyond just
enough to forward error notifications to the CXL RAS core. The forwarded
notification looks up a 'struct cxl_port' or 'struct cxl_dport'
companion device to the PCI device.

Introduce set_pcie_cxl() with logic checking for CXL.mem or CXL.cache
status in the CXL Flex Bus DVSEC status register. The CXL Flex Bus DVSEC
presence is used because it is required for all the CXL PCIe devices.[1]

[1] CXL 3.1 Spec, 8.1.1 PCIe Designated Vendor-Specific Extended
    Capability (DVSEC) ID Assignment, Table 8-2

Signed-off-by: Terry Bowman <terry.bowman@amd.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alejandro Lucero <alucerop@amd.com>
Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20260114182055.46029-4-terry.bowman@amd.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-01-22 14:55:27 -07:00
Terry Bowman 6612bd9ff0 PCI: Update CXL DVSEC definitions
CXL DVSEC definitions were recently moved into uapi/pci_regs.h, but the
newly added macros do not follow the file's existing naming conventions.
The current format uses CXL_DVSEC_XYZ, while the new CXL entries must
instead use the PCI_DVSEC_CXL_XYZ prefix to match the conventions already
established in pci_regs.h.

The new CXL DVSEC macros also introduce _MASK and _OFFSET suffixes, which
are not used anywhere else in the file. These suffixes lengthen the
identifiers and reduce readability. Remove _MASK and _OFFSET from the
recently added definitions.

Additionally, remove PCI_DVSEC_HEADER1_LENGTH, as it duplicates the existing
PCI_DVSEC_HEADER1_LEN() macro.

Update all existing references to use the new macro names.

Finally, update the inline documentation to reference the latest revision
of the CXL specification.

Signed-off-by: Terry Bowman <terry.bowman@amd.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20260114182055.46029-3-terry.bowman@amd.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-01-22 14:52:23 -07:00
Terry Bowman 0f7afd80d8 PCI: Move CXL DVSEC definitions into uapi/linux/pci_regs.h
The CXL DVSECs are currently defined in cxl/core/cxlpci.h. These are not
accessible to other subsystems. Move these to uapi/linux/pci_regs.h.

The CXL DVSEC definitions will be renamed and reformatted to fit better
with existing defines.

Signed-off-by: Terry Bowman <terry.bowman@amd.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20260114182055.46029-2-terry.bowman@amd.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-01-22 14:50:16 -07:00
Dan Williams c0c1262fbf PCI: Add PCIe Device 3 Extended Capability enumeration
PCIe r7.0 Section 7.7.9 Device 3 Extended Capability Structure, defines the
canonical location for determining the Flit Mode of a device. This status
is a dependency for PCIe IDE enabling. Add a new fm_enabled flag to 'struct
pci_dev'.

Cc: Lukas Wunner <lukas@wunner.de>
Cc: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Samuel Ortiz <sameo@rivosinc.com>
Cc: Alexey Kardashevskiy <aik@amd.com>
Cc: Xu Yilun <yilun.xu@linux.intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20251031212902.2256310-6-dan.j.williams@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2025-11-03 19:27:41 -08:00
Dan Williams 3225f52cde PCI/TSM: Establish Secure Sessions and Link Encryption
The PCIe 7.0 specification, section 11, defines the Trusted Execution
Environment (TEE) Device Interface Security Protocol (TDISP).  This
protocol definition builds upon Component Measurement and Authentication
(CMA), and link Integrity and Data Encryption (IDE). It adds support for
assigning devices (PCI physical or virtual function) to a confidential VM
such that the assigned device is enabled to access guest private memory
protected by technologies like Intel TDX, AMD SEV-SNP, RISCV COVE, or ARM
CCA.

The "TSM" (TEE Security Manager) is a concept in the TDISP specification
of an agent that mediates between a "DSM" (Device Security Manager) and
system software in both a VMM and a confidential VM. A VMM uses TSM ABIs
to setup link security and assign devices. A confidential VM uses TSM
ABIs to transition an assigned device into the TDISP "RUN" state and
validate its configuration. From a Linux perspective the TSM abstracts
many of the details of TDISP, IDE, and CMA. Some of those details leak
through at times, but for the most part TDISP is an internal
implementation detail of the TSM.

CONFIG_PCI_TSM adds an "authenticated" attribute and "tsm/" subdirectory
to pci-sysfs. Consider that the TSM driver may itself be a PCI driver.
Userspace can watch for the arrival of a "TSM" device,
/sys/class/tsm/tsm0/uevent KOBJ_CHANGE, to know when the PCI core has
initialized TSM services.

The operations that can be executed against a PCI device are split into
two mutually exclusive operation sets, "Link" and "Security" (struct
pci_tsm_{link,security}_ops). The "Link" operations manage physical link
security properties and communication with the device's Device Security
Manager firmware. These are the host side operations in TDISP. The
"Security" operations coordinate the security state of the assigned
virtual device (TDI). These are the guest side operations in TDISP.

Only "link" (Secure Session and physical Link Encryption) operations are
defined at this stage. There are placeholders for the device security
(Trusted Computing Base entry / exit) operations.

The locking allows for multiple devices to be executing commands
simultaneously, one outstanding command per-device and an rwsem
synchronizes the implementation relative to TSM registration/unregistration
events.

Thanks to Wu Hao for his work on an early draft of this support.

Cc: Lukas Wunner <lukas@wunner.de>
Cc: Samuel Ortiz <sameo@rivosinc.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Alexey Kardashevskiy <aik@amd.com>
Co-developed-by: Xu Yilun <yilun.xu@linux.intel.com>
Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
Link: https://patch.msgid.link/20251031212902.2256310-5-dan.j.williams@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2025-11-03 19:27:41 -08:00
Dan Williams f16469ee73 PCI/IDE: Enumerate Selective Stream IDE capabilities
Link encryption is a new PCIe feature enumerated by "PCIe r7.0 section
7.9.26 IDE Extended Capability".

It is both a standalone port + endpoint capability, and a building block
for the security protocol defined by "PCIe r7.0 section 11 TEE Device
Interface Security Protocol (TDISP)". That protocol coordinates device
security setup between a platform TSM (TEE Security Manager) and a
device DSM (Device Security Manager). While the platform TSM can
allocate resources like Stream ID and manage keys, it still requires
system software to manage the IDE capability register block.

Add register definitions and basic enumeration in preparation for
Selective IDE Stream establishment. A follow on change selects the new
CONFIG_PCI_IDE symbol. Note that while the IDE specification defines
both a point-to-point "Link Stream" and a Root Port to endpoint
"Selective Stream", only "Selective Stream" is considered for Linux as
that is the predominant mode expected by Trusted Execution Environment
Security Managers (TSMs), and it is the security model that limits the
number of PCI components within the TCB in a PCIe topology with
switches.

Co-developed-by: Alexey Kardashevskiy <aik@amd.com>
Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
Co-developed-by: Xu Yilun <yilun.xu@linux.intel.com>
Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alexey Kardashevskiy <aik@amd.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@kernel.org>
Link: https://patch.msgid.link/20251031212902.2256310-3-dan.j.williams@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2025-11-03 19:27:40 -08:00
Bjorn Helgaas fef3530379 Merge branch 'pci/capability-search'
- Simplify __pci_find_next_cap_ttl() by replacing magic numbers with
  #defines, extracting fields with FIELD_GET(), etc (Hans Zhang)

- Convert __pci_find_next_cap_ttl() to a PCI_FIND_NEXT_CAP() macro that
  takes a config space accessor function so we can also use it in cases
  where the usual config accessors aren't available (Hans Zhang)

- Similarly convert pci_find_next_ext_capability() to a
  PCI_FIND_NEXT_EXT_CAP() macro (Hans Zhang)

- Implement dwc, dwc endpoint, and cadence capability search interfaces on
  top of PCI_FIND_NEXT_CAP() and PCI_FIND_NEXT_EXT_CAP(), replacing the
  previous duplicated code (Hans Zhang)

- Search for capabilities in the cadence core instead of hard-coding their
  offsets, which are subject to change (Hans Zhang)

* pci/capability-search:
  PCI: cadence: Use cdns_pcie_find_*capability() to avoid hardcoding offsets
  PCI: cadence: Implement capability search using PCI core APIs
  PCI: dwc: ep: Implement capability search using PCI core APIs
  PCI: dwc: Implement capability search using PCI core APIs
  PCI: Refactor extended capability search into PCI_FIND_NEXT_EXT_CAP()
  PCI: Refactor capability search into PCI_FIND_NEXT_CAP()
  PCI: Clean up __pci_find_next_cap_ttl() readability
2025-10-03 12:13:14 -05:00
Lukas Wunner c8ab5e888b PCI/AER: Print TLP Log for errors introduced since PCIe r1.1
When reporting an error, the AER driver prints the TLP Header / Prefix Log
only for errors enumerated in the AER_LOG_TLP_MASKS macro.

The macro was never amended since its introduction in 2006 with commit
6c2b374d74 ("PCI-Express AER implemetation: AER core and aerdriver").
At the time, PCIe r1.1 was the latest spec revision.

Amend the macro with errors defined since then to avoid omitting the TLP
Header / Prefix Log for newer errors.

The order of the errors in AER_LOG_TLP_MASKS follows PCIe r1.1 sec 6.2.7
rather than 7.10.2, because only the former documents for which errors a
TLP Header / Prefix is logged.  Retain this order.  The section number is
still 6.2.7 in today's PCIe r7.0.

For Completion Timeouts, the TLP Header / Prefix is only logged if the
Completion Timeout Prefix / Header Log Capable bit is set in the AER
Capabilities and Control register.  Introduce a tlp_header_logged() helper
to check whether the TLP Header / Prefix Log is populated and use it in
the two places which currently match against AER_LOG_TLP_MASKS directly.

For Uncorrectable Internal Errors, logging of the TLP Header / Prefix is
optional per PCIe r7.0 sec 6.2.7.  If needed, drivers could indicate
through a flag whether devices are capable and tlp_header_logged() could
then check that flag.

pcitools introduced macros for newer errors with commit 144b0911cc0b
("ls-ecaps: extend decode support for more fields for AER CE and UE
status"):
  https://git.kernel.org/pub/scm/utils/pciutils/pciutils.git/commit/?id=144b0911cc0b

Unfortunately some of those macros are overly long:
  PCI_ERR_UNC_POISONED_TLP_EGRESS
  PCI_ERR_UNC_DMWR_REQ_EGRESS_BLOCKED
  PCI_ERR_UNC_IDE_CHECK
  PCI_ERR_UNC_MISR_IDE_TLP
  PCI_ERR_UNC_PCRC_CHECK
  PCI_ERR_UNC_TLP_XLAT_EGRESS_BLOCKED

This seems unsuitable for <linux/pci_regs.h>, so shorten to:
  PCI_ERR_UNC_POISON_BLK
  PCI_ERR_UNC_DMWR_BLK
  PCI_ERR_UNC_IDE_CHECK
  PCI_ERR_UNC_MISR_IDE
  PCI_ERR_UNC_PCRC_CHECK
  PCI_ERR_UNC_XLAT_BLK

Note that some of the existing macros in <linux/pci_regs.h> do not match
exactly with pcitools (e.g. PCI_ERR_UNC_SDES versus PCI_ERR_UNC_SURPDN),
so it does not seem mandatory for them to be identical.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/5f707caf1260bd8f15012bb032f7da9a9b898aba.1756712066.git.lukas@wunner.de
2025-09-04 10:09:05 -05:00
Hans Zhang 37d1ade896 PCI: Clean up __pci_find_next_cap_ttl() readability
Refactor the __pci_find_next_cap_ttl() to improve code clarity:

  - Replace magic number 0x40 with PCI_STD_HEADER_SIZEOF.
  - Use ALIGN_DOWN() for position alignment instead of manual bitmask.
  - Extract PCI capability fields via FIELD_GET() with standardized masks.
  - Add necessary headers (linux/align.h).

No functional changes intended.

Signed-off-by: Hans Zhang <18255117159@163.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Niklas Schnelle <schnelle@linux.ibm.com>
Acked-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://patch.msgid.link/20250813144529.303548-2-18255117159@163.com
2025-08-14 15:03:34 -05:00
Michał Winiarski 5a8f77e24a PCI/IOV: Restore VF resizable BAR state after reset
Similar to regular resizable BARs, VF BARs can also be resized, e.g. by the
system firmware or the PCI subsystem itself.

The capability layout is the same as PCI_EXT_CAP_ID_REBAR.

Add the capability ID and restore it as a part of IOV state.

See PCIe r6.2, sec 7.8.7.

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patch.msgid.link/20250702093522.518099-2-michal.winiarski@intel.com
2025-07-14 14:58:13 -05:00
Krishna Chaitanya Chundru 178af54a67 PCI: Add lane equalization register offsets
As per PCIe spec 6.0.1, add PCIe lane equalization register offset for
data rates 8.0 GT/s, 32.0 GT/s and 64.0 GT/s.

Also add a macro for defining data rate 64.0 GT/s physical layer capability
ID.

Signed-off-by: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Link: https://patch.msgid.link/20250328-preset_v6-v9-4-22cfa0490518@oss.qualcomm.com
2025-04-19 19:42:43 +05:30
Bjorn Helgaas 38d42a6612 Merge branch 'pci/resource'
- Use pci_resource_n() to simplify BAR/window resource lookup (Ilpo
  Järvinen)

- Fix typo that repeatedly distributed resources to a bridge instead of
  iterating over subordinate bridges, which resulted in too little space to
  assign some BARs (Kai-Heng Feng)

- Relax bridge window tail sizing for optional resources, e.g., IOV BARs,
  to avoid failures when removing and re-adding devices (Ilpo Järvinen)

- Fix a double counting error for I/O resources, as we previously did for
  memory resources (Ilpo Järvinen)

- Use resource_set_{range,size}() helpers in more places (Ilpo Järvinen)

- Add pci_resource_is_iov() to identify IOV resources (Ilpo Järvinen)

- Add pci_resource_num() to look up the BAR number from the resource
  pointer (Ilpo Järvinen)

- Add restore_dev_resource() to simplify code that resources saved device
  resources (Ilpo Järvinen)

- Allow drivers to enable devices even if we haven't assigned optional IOV
  resources to them (Ilpo Järvinen)

- Improve debug output during resource reallocation (Ilpo Järvinen)

- Rework handling of optional resources (IOV BARs, ROMs) to reduce failures
  if we can't allocate them (Ilpo Järvinen)

- Move declarations of pci_rescan_bus_bridge_resize(),
  pci_reassign_bridge_resources(), and CardBus-related sizes from
  include/linux/pci.h to drivers/pci/pci.h since they're not used outside
  the PCI core (Ilpo Järvinen)

- Make pci_setup_bridge() static (Ilpo Järvinen)

- Fix a NULL dereference in the SR-IOV VF creation error path (Shay Drory)

- Fix s390 mmio_read/write syscalls, which didn't cause page faults in some
  cases, which broke vfio-pci lazy mapping on first access (Niklas
  Schnelle)

- Add pdev->non_mappable_bars to replace CONFIG_VFIO_PCI_MMAP, which was
  disabled only for s390 (Niklas Schnelle)

- Support mmap of PCI resources on s390 except for ISM devices (Niklas
  Schnelle)

* pci/resource:
  s390/pci: Support mmap() of PCI resources except for ISM devices
  s390/pci: Introduce pdev->non_mappable_bars and replace VFIO_PCI_MMAP
  s390/pci: Fix s390_mmio_read/write syscall page fault handling
  PCI: Fix NULL dereference in SR-IOV VF creation error path
  PCI: Move cardbus IO size declarations into pci/pci.h
  PCI: Make pci_setup_bridge() static
  PCI: Move resource reassignment func declarations into pci/pci.h
  PCI: Move pci_rescan_bus_bridge_resize() declaration to pci/pci.h
  PCI: Fix BAR resizing when VF BARs are assigned
  PCI: Do not claim to release resource falsely
  PCI: Increase Resizable BAR support from 512 GB to 128 TB
  PCI: Rework optional resource handling
  PCI: Perform reset_resource() and build fail list in sync
  PCI: Use res->parent to check if resource is assigned
  PCI: Add debug print when releasing resources before retry
  PCI: Indicate optional resource assignment failures
  PCI: Always have realloc_head in __assign_resources_sorted()
  PCI: Extend enable to check for any optional resource
  PCI: Add restore_dev_resource()
  PCI: Remove incorrect comment from pci_reassign_resource()
  PCI: Consolidate assignment loop next round preparation
  PCI: Rename retval to ret
  PCI: Use while loop and break instead of gotos
  PCI: Refactor pdev_sort_resources() & __dev_sort_resources()
  PCI: Converge return paths in __assign_resources_sorted()
  PCI: Add dev & res local variables to resource assignment funcs
  PCI: Add pci_resource_num() helper
  PCI: Check resource_size() separately
  PCI: Add pci_resource_is_iov() to identify IOV resources
  PCI: Use resource_set_{range,size}() helpers
  PCI: Use SZ_* instead of literals in setup-bus.c
  PCI: Fix old_size lower bound in calculate_iosize() too
  PCI: Allow relaxed bridge window tail sizing for optional resources
  PCI: Simplify size1 assignment logic
  PCI: Use min_align, not unrelated add_align, for size0
  PCI: Remove add_align overwrite unrelated to size0
  PCI: Use downstream bridges for distributing resources
  PCI: Cleanup dev->resource + resno to use pci_resource_n()
2025-03-27 13:14:45 -05:00
Bjorn Helgaas 651aa9052c Merge branch 'pci/doe'
- Rename DOE 'protocol' to 'feature' to follow spec terminology (Alistair
  Francis)

- Expose supported DOE features via sysfs (Alistair Francis)

- Allow DOE support to be enabled even if CXL isn't enabled (Alistair
  Francis)

* pci/doe:
  PCI/DOE: Allow enabling DOE without CXL
  PCI/DOE: Expose DOE features via sysfs
  PCI/DOE: Rename Discovery Response Data Object Contents to type
  PCI/DOE: Rename DOE protocol to feature
2025-03-27 13:14:43 -05:00
Zhiyuan Dai 5af473941b PCI: Increase Resizable BAR support from 512 GB to 128 TB
Per PCIe r6.0, sec 7.8.6.2, devices can advertise Resizable BAR sizes up to
128 TB in the Resizable BAR Capability register.  Larger sizes can be
advertised via the Capability register, but that requires an API change.

Update pci_rebar_get_possible_sizes() and pbus_size_mem() to increase the
sizes we currently support from 512 GB to 128 TB.

Link: https://lore.kernel.org/r/20250307053535.44918-1-daizhiyuan@phytium.com.cn
Signed-off-by: Zhiyuan Dai <daizhiyuan@phytium.com.cn>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-03-07 11:23:34 -06:00
Alistair Francis f810d17762 PCI/DOE: Rename Discovery Response Data Object Contents to type
PCIe r6.1, sec 6.30.1.1, describes a "Vendor ID", a "Data Object Type" and
"Next Index" as the fields in the DOE Discovery Response Data Object. The
DOE driver currently uses both the terms 'type' and 'prot' for the second
element.

Rename all uses of the DOE Discovery Response Data Object to use 'type' as
the second element of the object header, instead of type/prot as it
currently is.

Link: https://lore.kernel.org/r/20250306075211.1855177-2-alistair@alistair23.me
Signed-off-by: Alistair Francis <alistair@alistair23.me>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-03-06 12:48:16 -06:00
Ilpo Järvinen 7e077e6707 PCI/ERR: Handle TLP Log in Flit mode
Flit mode introduced in PCIe r6.0 alters how the TLP Header Log is
presented through AER and DPC Capability registers. The TLP Prefix Log
Register is not present with Flit mode, and the register becomes an
extension of the TLP Header Log (PCIe r6.1 secs 7.8.4.12 & 7.9.14.13).

Adapt pcie_read_tlp_log() and struct pcie_tlp_log to read and store the
extended TLP Header Log when the Link is in Flit mode. As the Prefix Log
and Extended TLP Header are not present at the same time, a C union can be
used.

Determining whether the error occurred while the Link was in Flit mode is a
bit complicated. In case of AER, the Advanced Error Capabilities and
Control Register directly tells whether the error was logged in Flit mode
or not (PCIe r6.1 sec 7.8.4.7). The DPC Capability (PCIe r6.1 sec 7.9.14),
unfortunately, does not contain the same information.

Unlike AER, the DPC Capability does not provide a way to discern whether
the error was logged in Flit mode (this is confirmed by PCI WG to be an
oversight in the spec). DPC will bring the Link down immediately following
an error, which makes it impossible to acquire the Flit Mode Status
directly from the Link Status 2 register because Flit Mode Status is only
set in certain Link states (PCIe r6.1 sec 7.5.3.20). As a workaround, use
the flit_mode value stored into the struct pci_bus.

Link: https://lore.kernel.org/r/20250207161836.2755-3-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2025-02-21 17:31:45 -06:00
Bjorn Helgaas 10ff5bbfd4 Merge branch 'pci/misc'
- Constify struct bin_attribute for sysfs, VPD, P2PDMA, and the IBM ACPI
  hotplug driver (Thomas Weißschuh)

- Update PCI_EXP_LNKCAP_SLS comment (Lukas Wunner)

- Drop superfluous pm_wakeup.h include (Wolfram Sang)

- Remove redundant PCI_VSEC_HDR and PCI_VSEC_HDR_LEN_SHIFT (Dongdong Zhang)

- Correct documentation of the 'config_acs=' kernel parameter (Akihiko
  Odaki)

* pci/misc:
  Documentation: Fix pci=config_acs= example
  PCI: Remove redundant PCI_VSEC_HDR and PCI_VSEC_HDR_LEN_SHIFT
  PCI: Don't include 'pm_wakeup.h' directly
  PCI: Update code comment on PCI_EXP_LNKCAP_SLS for PCIe r3.0
  PCI/ACPI: Constify 'struct bin_attribute'
  PCI/P2PDMA: Constify 'struct bin_attribute'
  PCI/VPD: Constify 'struct bin_attribute'
  PCI/sysfs: Constify 'struct bin_attribute'
2025-01-23 13:05:06 -06:00
Dongdong Zhang 188973e353 PCI: Remove redundant PCI_VSEC_HDR and PCI_VSEC_HDR_LEN_SHIFT
Remove duplicate macro PCI_VSEC_HDR and its related macro
PCI_VSEC_HDR_LEN_SHIFT from pci_regs.h to avoid redundancy and
inconsistencies. Update VFIO PCI code to use PCI_VNDR_HEADER and
PCI_VNDR_HEADER_LEN() for consistent naming and functionality.

These changes aim to streamline header handling while minimizing impact,
given the niche usage of these macros in userspace.

Link: https://lore.kernel.org/r/20241216013536.4487-1-zhangdongdong@eswincomputing.com
Signed-off-by: Dongdong Zhang <zhangdongdong@eswincomputing.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
2025-01-21 17:29:11 -06:00
Ilpo Järvinen ad41ddeeac PCI: Add TLP Prefix reading to pcie_read_tlp_log()
pcie_read_tlp_log() handles only 4 Header Log DWORDs but TLP Prefix Log
(PCIe r6.1 secs 7.8.4.12 & 7.9.14.13) may also be present.

Generalize pcie_read_tlp_log() and struct pcie_tlp_log to also handle TLP
Prefix Log. The relevant registers are formatted identically in AER and DPC
Capability, but has these variations:

  a) The offsets of TLP Prefix Log registers vary.

  b) DPC RP PIO TLP Prefix Log register can be < 4 DWORDs.

  c) AER TLP Prefix Log Present (PCIe r6.1 sec 7.8.4.7) can indicate Prefix
     Log is not present.

Therefore callers must pass the offset of the TLP Prefix Log register and
the entire length to pcie_read_tlp_log() to be able to read the correct
number of TLP Prefix DWORDs from the correct offset.

Link: https://lore.kernel.org/r/20250114170840.1633-8-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
[bhelgaas: squash ternary fix from
https://lore.kernel.org/r/20250116172019.88116-1-colin.i.king@gmail.com]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2025-01-16 12:04:38 -06:00
Ilpo Järvinen e5321ae10e PCI: Store number of supported End-End TLP Prefixes
eetlp_prefix_path in the struct pci_dev tells if End-End TLP Prefixes
are supported by the path or not, and the value is only calculated if
CONFIG_PCI_PASID is set.

The Max End-End TLP Prefixes field in the Device Capabilities Register 2
also tells how many (1-4) End-End TLP Prefixes are supported (PCIe r6.2 sec
7.5.3.15). The number of supported End-End Prefixes is useful for reading
correct number of DWORDs from TLP Prefix Log register in AER capability
(PCIe r6.2 sec 7.8.4.12).

Replace eetlp_prefix_path with eetlp_prefix_max and determine the number of
supported End-End Prefixes regardless of CONFIG_PCI_PASID so that an
upcoming commit generalizing TLP Prefix Log register reading does not have
to read extra DWORDs for End-End Prefixes that never will be there.

The value stored into eetlp_prefix_max is directly derived from device's
Max End-End TLP Prefixes and does not consider limitations imposed by
bridges or the Root Port beyond supported/not supported flags. This is
intentional for two reasons:

  1) PCIe r6.2 spec sections 2.2.10.4 & 6.2.4.4 indicate that a TLP is
     malformed only if the number of prefixes exceed the number of Max
     End-End TLP Prefixes, which seems to be the case even if the device
     could never receive that many prefixes due to smaller maximum imposed
     by a bridge or the Root Port. If TLP parsing is later added, this
     distinction is significant in interpreting what is logged by the TLP
     Prefix Log registers and the value matching to the Malformed TLP
     threshold is going to be more useful.

  2) TLP Prefix handling happens autonomously on a low layer and the value
     in eetlp_prefix_max is not programmed anywhere by the kernel (i.e.,
     there is no limiter OS can control to prevent sending more than N TLP
     Prefixes).

Link: https://lore.kernel.org/r/20250114170840.1633-7-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
2025-01-14 17:47:39 -06:00
Lukas Wunner e10c5cbd1c PCI: Update code comment on PCI_EXP_LNKCAP_SLS for PCIe r3.0
Niklas notes that the code comment on the PCI_EXP_LNKCAP_SLS macro is
outdated as it reflects the meaning of the field prior to PCIe r3.0.
Update it to avoid confusion.

Closes: https://lore.kernel.org/r/70829798889c6d779ca0f6cd3260a765780d1369.camel@kernel.org
Link: https://lore.kernel.org/r/6152bd17cbe0876365d5f4624fc317529f4bbc85.1734376438.git.lukas@wunner.de
Reported-by: Niklas Schnelle <niks@kernel.org>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
2024-12-19 00:09:01 +00:00
Bjorn Helgaas ab02bafcec Merge branch 'pci/tph'
- Add and document TLP Processing Hints (TPH) support so drivers can enable
  and disable TPH and the kernel can save/restore TPH configuration (Wei
  Huang)

- Add TPH Steering Tag support so drivers can retrieve Steering Tag values
  associated with specific CPUs via an ACPI _DSM to direct DMA writes
  closer to their consumers (Wei Huang)

* pci/tph:
  PCI/TPH: Add TPH documentation
  PCI/TPH: Add Steering Tag support
  PCI: Add TLP Processing Hints (TPH) support
2024-11-25 13:40:55 -06:00
Ilpo Järvinen d2bd39c045 PCI: Store all PCIe Supported Link Speeds
The PCIe bandwidth controller added by a subsequent commit will require
selecting PCIe Link Speeds that are lower than the Maximum Link Speed.

The struct pci_bus only stores max_bus_speed. Even if PCIe r6.1 sec 8.2.1
currently disallows gaps in supported Link Speeds, the Implementation Note
in PCIe r6.1 sec 7.5.3.18, recommends determining supported Link Speeds
using the Supported Link Speeds Vector in the Link Capabilities 2 Register
(when available) to "avoid software being confused if a future
specification defines Links that do not require support for all slower
speeds."

Reuse code in pcie_get_speed_cap() to add pcie_get_supported_speeds() to
query the Supported Link Speeds Vector of a PCIe device. The value is taken
directly from the Supported Link Speeds Vector or synthesized from the Max
Link Speed in the Link Capabilities Register when the Link Capabilities 2
Register is not available.

The Supported Link Speeds Vector in the Link Capabilities Register 2
corresponds to the bus below on Root Ports and Downstream Ports, whereas it
corresponds to the bus above on Upstream Ports and Endpoints (PCIe r6.1 sec
7.5.3.18):

  Supported Link Speeds Vector - This field indicates the supported Link
  speed(s) of the associated Port.

Add supported_speeds into the struct pci_dev that caches the
Supported Link Speeds Vector.

supported_speeds contains a set of Link Speeds only in the case where PCIe
Link Speed can be determined. Root Complex Integrated Endpoints do not have
a well-defined Link Speed because they do not implement either of the Link
Capabilities Registers, which is allowed by PCIe r6.1 sec 7.5.3 (the same
limitation applies to determining cur_bus_speed and max_bus_speed that are
PCI_SPEED_UNKNOWN in such case). This is of no concern from PCIe bandwidth
controller point of view because such devices are not attached into a PCIe
Root Port that could be controlled.

The supported_speeds field keeps the extra reserved zero at the least
significant bit to match the Link Capabilities 2 Register layout.

An attempt was made to store supported_speeds field into the struct pci_bus
as an intersection of both ends of the Link, however, the subordinate
struct pci_bus is not available early enough. The Target Speed quirk (in
pcie_failed_link_retrain()) can run either during initial scan or later,
requiring it to use the API provided by the PCIe bandwidth controller to
set the Target Link Speed in order to co-exist with the bandwidth
controller. When the Target Speed quirk is calling the bandwidth controller
during initial scan, the struct pci_bus is not yet initialized. As such,
storing supported_speeds into the struct pci_bus is not viable.

Suggested-by: Lukas Wunner <lukas@wunner.de>
Link: https://lore.kernel.org/r/20241018144755.7875-4-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
[bhelgaas: move pcie_get_supported_speeds() decl to drivers/pci/pci.h]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-11-11 14:19:30 -06:00
Wei Huang f69767a1ad PCI: Add TLP Processing Hints (TPH) support
Add support for PCIe TLP Processing Hints (TPH) support (see PCIe r6.2,
sec 6.17).

Add TPH register definitions in pci_regs.h, including the TPH Requester
capability register, TPH Requester control register, TPH Completer
capability, and the ST fields of MSI-X entry.

Introduce pcie_enable_tph() and pcie_disable_tph(), enabling drivers to
toggle TPH support and configure specific ST mode as needed. Also add a new
kernel parameter, "pci=notph", allowing users to disable TPH support across
the entire system.

Link: https://lore.kernel.org/r/20241002165954.128085-2-wei.huang2@amd.com
Co-developed-by: Jing Liu <jing2.liu@intel.com>
Co-developed-by: Paul Luse <paul.e.luse@linux.intel.com>
Co-developed-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Jing Liu <jing2.liu@intel.com>
Signed-off-by: Paul Luse <paul.e.luse@linux.intel.com>
Signed-off-by: Eric Van Tassell <Eric.VanTassell@amd.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
2024-10-02 16:20:01 -05:00
Bjorn Helgaas 9d4f1c0747 Merge branch 'pci/npem'
- Initialize leds class earlier (with an unfortunate Makefile ordering
  change) so the PCI NPEM driver can use it (Mariusz Tkaczyk)

- Add Native PCIe Enclosure Management (NPEM) support for sysfs control of
  NVMe RAID storage indicators (ok/fail/locate/rebuild/etc) (Mariusz
  Tkaczyk)

- Add support for the ACPI _DSM PCIe SSD status LED management, which is
  functionally similar to NPEM but mediated by platform firmware (Mariusz
  Tkaczyk)

* pci/npem:
  PCI/NPEM: Add _DSM PCIe SSD status LED management
  PCI/NPEM: Add Native PCIe Enclosure Management support
  leds: Init leds class earlier
2024-09-19 14:25:26 -05:00
Bjorn Helgaas 87f10faf16 PCI: Rename CRS Completion Status to RRS
PCIe r6.0 changed the abbreviation for "Configuration Request Retry Status"
Completion Status from "CRS" to "RRS" and uses the terminology of
"Configuration RRS Software Visibility" instead of "CRS Software
Visibility".

Align the Linux usage with the r6.0 spec language.  No functional change
intended.

It's confusing to make this change, but I think "RRS" *is* a better
abbreviation because it was easy to interpret "CRS" as "Completion Retry
Status", which really didn't make any sense.

Link: https://lore.kernel.org/r/20240827234848.4429-4-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-09-10 19:52:30 -05:00
Mariusz Tkaczyk 4e893545ef PCI/NPEM: Add Native PCIe Enclosure Management support
Native PCIe Enclosure Management (NPEM, PCIe r6.1 sec 6.28) allows managing
LEDs in storage enclosures. NPEM is indication oriented and it does not
give direct access to LEDs. Although each indication *could* represent an
individual LED, multiple indications could also be represented as a single,
multi-color LED or a single LED blinking in a specific interval.  The
specification leaves that open.

Each enabled indication (capability register bit on) is represented as a
ledclass_dev which can be controlled through sysfs. For every ledclass
device only 2 brightness states are allowed: LED_ON (1) or LED_OFF (0).
This corresponds to the NPEM control register (Indication bit on/off).

Ledclass devices appear in sysfs as child devices (subdirectory) of PCI
device which has an NPEM Extended Capability and indication is enabled in
NPEM capability register. For example, these are LEDs created for pcieport
"10000:02:05.0" on my setup:

  leds/
  ├── 10000:02:05.0:enclosure:fail
  ├── 10000:02:05.0:enclosure:locate
  ├── 10000:02:05.0:enclosure:ok
  └── 10000:02:05.0:enclosure:rebuild

They can be also found in "/sys/class/leds" directory. The parent PCIe
device domain/bus/device/function address is used to guarantee uniqueness
across leds subsystem.

To enable/disable a "fail" indication, the "brightness" file can be edited:

  echo 1 > ./leds/10000:02:05.0:enclosure:fail/brightness
  echo 0 > ./leds/10000:02:05.0:enclosure:fail/brightness

PCIe r6.1, sec 7.9.19.2 defines the possible indications.

Multiple indications for same parent PCIe device can conflict and hardware
may update them when processing new request. To avoid issues, driver
refresh all indications by reading back control register.

This driver expects to be the exclusive NPEM extended capability manager.
It waits up to 1 second after imposing new request, it doesn't verify if
controller is busy before write, and it assumes the mutex lock gives
protection from concurrent updates.

If _DSM LED management is available, we assume the platform may be using
NPEM for its own purposes (see PCI Firmware Spec r3.3 sec 4.7), so the
driver does not use NPEM. A future patch will add _DSM support; an info
message notes whether NPEM or _DSM is being used.

NPEM is a PCIe extended capability so it should be registered in
pcie_init_capabilities() but it is not possible due to LED dependency.  The
parent pci_device must be added earlier for led_classdev_register() to be
successful. NPEM does not require configuration on kernel side, so it is
safe to register LED devices later.

Link: https://lore.kernel.org/r/20240904104848.23480-3-mariusz.tkaczyk@linux.intel.com
Suggested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Stuart Hayes <stuart.w.hayes@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-09-04 17:25:12 -05:00
Bjorn Helgaas d539117564 Merge branch 'pci/doe'
- Add support for DOE Discovery version 2 (Alexey Kardashevskiy)

* pci/doe:
  PCI/DOE: Support discovery version 2
2024-05-16 18:14:09 -05:00
Dave Jiang b1956e2d07 PCI/CXL: Fail bus reset if upstream CXL Port has SBR masked
Per CXL spec r3.1, sec 8.1.5.2, the Secondary Bus Reset (SBR) bit in the
Bridge Control register of a CXL port has no effect unless the "Unmask SBR"
bit is set.

Return -ENOTTY if we attempt a bus reset on a device below a CXL Port where
"Unmask SBR" is 0.  Otherwise, the bus reset would appear to have succeeded
even though setting the bridge SBR bit had no effect.

Link: https://lore.kernel.org/linux-cxl/20240220203956.GA1502351@bhelgaas/
Link: https://lore.kernel.org/r/20240502165851.1948523-4-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
[bhelgaas: simplify commit log and comments]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
2024-05-08 13:25:36 -05:00
Alexey Kardashevskiy eebab7e3eb PCI/DOE: Support discovery version 2
PCIe r6.1, sec 6.30.1.1 defines a "DOE Discovery Version" field in
the DOE Discovery Request Data Object Contents (3rd DW) as:

15:8 DOE Discovery Version – must be 02h if the Capability Version in
the Data Object Exchange Extended Capability is 02h or greater.

Add support for the version on devices with the DOE v2 capability.

Link: https://lore.kernel.org/r/20240307022006.3657433-1-aik@amd.com
Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
2024-04-09 09:33:15 -05:00
Bjorn Helgaas 5897c17402 Merge branch 'pci/field-get'
- Use FIELD_GET()/FIELD_PREP() when possible throughout drivers/pci/ (Ilpo
  Järvinen, Bjorn Helgaas)

- Rework DPC control programming for clarity (Ilpo Järvinen)

* pci/field-get:
  PCI/portdrv: Use FIELD_GET()
  PCI/VC: Use FIELD_GET()
  PCI/PTM: Use FIELD_GET()
  PCI/PME: Use FIELD_GET()
  PCI/ATS: Use FIELD_GET()
  PCI/ATS: Show PASID Capability register width in bitmasks
  PCI: Use FIELD_GET() in Sapphire RX 5600 XT Pulse quirk
  PCI: Use FIELD_GET()
  PCI/MSI: Use FIELD_GET/PREP()
  PCI/DPC: Use defines with DPC reason fields
  PCI/DPC: Use defined fields with DPC_CTL register
  PCI/DPC: Use FIELD_GET()
  PCI: hotplug: Use FIELD_GET/PREP()
  PCI: dwc: Use FIELD_GET/PREP()
  PCI: cadence: Use FIELD_GET()
  PCI: Use FIELD_GET() to extract Link Width
  PCI: mvebu: Use FIELD_PREP() with Link Width
  PCI: tegra194: Use FIELD_GET()/FIELD_PREP() with Link Width fields

# Conflicts:
#	drivers/pci/controller/dwc/pcie-tegra194.c
2023-10-28 13:31:05 -05:00
Bjorn Helgaas 553b84bf46 Merge branch 'pci/enumeration'
- Add and use pci_get_base_class() to search for all PCI_BASE_CLASS_DISPLAY
  devices (Sui Jingfeng)

- Fix a vmd check for multi-function devices (Ilpo Järvinen)

- Add PCI_HEADER_TYPE_MFD and use it to replace literals (Ilpo Järvinen)

- Use acpi_evaluate_dsm_typed() instead of open-coding it (Andy Shevchenko)

- Keep .remove() and .probe() callbacks (previously marked __init) in case
  they're used via sysfs (Uwe Kleine-König)

* pci/enumeration:
  PCI: keystone: Don't discard .probe() callback
  PCI: keystone: Don't discard .remove() callback
  PCI: kirin: Don't discard .remove() callback
  PCI: exynos: Don't discard .remove() callback
  PCI/ACPI: Use acpi_evaluate_dsm_typed()
  PCI: Use PCI_HEADER_TYPE_* instead of literals
  PCI: Add PCI_HEADER_TYPE_MFD definition
  PCI: vmd: Correct PCI Header Type Register's multi-function check
  drm/radeon: Use pci_get_base_class() to reduce duplicated code
  drm/amdgpu: Use pci_get_base_class() to reduce duplicated code
  drm/nouveau: Use pci_get_base_class() to reduce duplicated code
  ALSA: hda: Use pci_get_base_class() to reduce duplicated code
  PCI: Add pci_get_base_class() helper
2023-10-28 13:30:58 -05:00
Bjorn Helgaas ec302b118a PCI/PME: Use FIELD_GET()
Use FIELD_GET() to remove dependences on the field position, i.e., the
shift value.  No functional change intended.

Link: https://lore.kernel.org/r/20231010204436.1000644-8-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
2023-10-24 16:55:45 -05:00
Bjorn Helgaas e0701bd0e6 PCI/ATS: Use FIELD_GET()
Use FIELD_GET() to remove dependences on the field position, i.e., the
shift value.  No functional change intended.

Link: https://lore.kernel.org/r/20231010204436.1000644-6-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
2023-10-24 16:55:45 -05:00
Bjorn Helgaas d30fea2584 PCI/ATS: Show PASID Capability register width in bitmasks
The PASID Capability and Control registers are both 16 bits wide.  Use
16-bit wide constants in field names to match the register width.  No
functional change intended.

Link: https://lore.kernel.org/r/20231010204436.1000644-5-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
2023-10-24 16:55:45 -05:00
Ilpo Järvinen 74f0b5ffe1 PCI/DPC: Use defines with DPC reason fields
Add new defines for DPC reason fields and use them instead of literals.

Link: https://lore.kernel.org/r/20231018113254.17616-7-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
[bhelgaas: shorten comments]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-10-24 10:54:04 -05:00
Bjorn Helgaas 9a9eec4765 PCI/DPC: Use FIELD_GET()
Use FIELD_GET() to remove dependencies on the field position, i.e., the
shift value. No functional change intended.

Link: https://lore.kernel.org/r/20231018113254.17616-5-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-10-24 10:54:04 -05:00
Ilpo Järvinen 92af77ca26 PCI: dwc: Use FIELD_GET/PREP()
Convert open-coded variants of PCI field access into FIELD_GET/PREP()
to make the code easier to understand.

Add two missing defines into pci_regs.h. Logically, the Max No-Snoop
Latency Register is a separate word sized register in the PCIe spec,
but the pre-existing LTR defines in pci_regs.h with dword long values
seem to consider the registers together (the same goes for the only
user). Thus, follow the custom and make the new values also take both
word long LTR registers as a joint dword register.

Link: https://lore.kernel.org/r/20231024110336.26264-1-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-10-24 10:53:58 -05:00
Ilpo Järvinen 1a11074be2 PCI: Add PCI_L1SS_CTL2 fields
Add L1 PM Substates Control 2 Register fields (PCI_L1SS_CTL2_*).

Link: https://lore.kernel.org/r/20230915155752.84640-2-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-10-10 16:03:51 -05:00
Ilpo Järvinen bdca03a271 PCI: Add PCI_HEADER_TYPE_MFD definition
Add PCI_HEADER_TYPE_MFD so we can replace literals in the code.

Link: https://lore.kernel.org/r/20231003125300.5541-3-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-10-03 11:55:15 -05:00
Ben Dooks 0b3dee602a PCI: Add PCI_EXT_CAP_ID_PL_32GT define
Add the define for PCI_EXT_CAP_ID_PL_32GT for drivers that will want this
whilst doing Gen5/Gen6 accesses.

Link: https://lore.kernel.org/r/20230531095713.293229-1-ben.dooks@codethink.co.uk
Signed-off-by: Ben Dooks <ben.dooks@sifive.com>
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-05-31 16:34:38 -05:00
Dave Jiang 248529edc8 cxl: add RAS status unmasking for CXL
By default the CXL RAS mask registers bits are defaulted to 1's and
suppress all error reporting. If the kernel has negotiated ownership
of error handling for CXL then unmask the mask registers by writing 0s.

PCI_EXP_DEVCTL capability is checked to see uncorrectable or correctable
errors bits are set before unmasking the respective errors.

Acked-by: Bjorn Helgaas <bhelgaas@google.com>  # pci_regs.h
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/167639402301.778884.12556849214955646539.stgit@djiang5-mobl3.local
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-02-14 14:12:54 -08:00
Linus Torvalds c7020e1b34 Merge tag 'pci-v6.2-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI updates from Bjorn Helgaas:
 "Enumeration:

   - Squash portdrv_{core,pci}.c into portdrv.c to ease maintenance and
     make more things static.

   - Make portdrv bind to Switch Ports that have AER. Previously, if
     these Ports lacked MSI/MSI-X, portdrv failed to bind, which meant
     the Ports couldn't be suspended to low-power states. AER on these
     Ports doesn't use interrupts, and the AER driver doesn't need to
     claim them.

   - Assign PCI domain IDs using ida_alloc(), which makes host bridge
     add/remove work better.

  Resource management:

   - To work better with recent BIOSes that use EfiMemoryMappedIO for
     PCI host bridge apertures, remove those regions from the E820 map
     (E820 entries normally prevent us from allocating BARs). In v5.19,
     we added some quirks to disable E820 checking, but that's not very
     maintainable. EfiMemoryMappedIO means the OS needs to map the
     region for use by EFI runtime services; it shouldn't prevent OS
     from using it.

  PCIe native device hotplug:

   - Build pciehp by default if USB4 is enabled, since Thunderbolt/USB4
     PCIe tunneling depends on native PCIe hotplug.

   - Enable Command Completed Interrupt only if supported to avoid user
     confusion from lspci output that says this is enabled but not
     supported.

   - Prevent pciehp from binding to Switch Upstream Ports; this happened
     because of interaction with acpiphp and caused devices below the
     Upstream Port to disappear.

  Power management:

   - Convert AGP drivers to generic power management. We hope to remove
     legacy power management from the PCI core eventually.

  Virtualization:

   - Fix pci_device_is_present(), which previously always returned
     "false" for VFs, causing virtio hangs when unbinding the driver.

  Miscellaneous:

   - Convert drivers to gpiod API to prepare for dropping some legacy
     code.

   - Fix DOE fencepost error for the maximum data object length.

  Baikal-T1 PCIe controller driver:

   - Add driver and DT bindings.

  Broadcom STB PCIe controller driver:

   - Enable Multi-MSI.

   - Delay 100ms after PERST# deassert to allow power and clocks to
     stabilize.

   - Configure Read Completion Boundary to 64 bytes.

  Freescale i.MX6 PCIe controller driver:

   - Initialize PHY before deasserting core reset to fix a regression in
     v6.0 on boards where the PHY provides the reference.

   - Fix imx6sx and imx8mq clock names in DT schema.

  Intel VMD host bridge driver:

   - Fix Secondary Bus Reset on VMD bridges, which allows reset of NVMe
     SSDs in VT-d pass-through scenarios.

   - Disable MSI remapping, which gets re-enabled by firmware during
     suspend/resume.

  MediaTek PCIe Gen3 controller driver:

   - Add MT7986 and MT8195 support.

  Qualcomm PCIe controller driver:

   - Add SC8280XP/SA8540P basic interconnect support.

  Rockchip DesignWare PCIe controller driver:

   - Base DT schema on common Synopsys schema.

  Synopsys DesignWare PCIe core:

   - Collect DT items shared between Root Port and Endpoint (PERST GPIO,
     PHY info, clocks, resets, link speed, number of lanes, number of
     iATU windows, interrupt info, etc) to snps,dw-pcie-common.yaml.

   - Add dma-ranges support for Root Ports and Endpoints.

   - Consolidate DT resource retrieval for "dbi", "dbi2", "atu", etc. to
     reduce code duplication.

   - Add generic names for clocks and resets to encourage more
     consistent naming across drivers using DesignWare IP.

   - Stop advertising PTM Responder role for Endpoints, which aren't
     allowed to be responders.

  TI J721E PCIe driver:

   - Add j721s2 host mode ID to DT schema.

   - Add interrupt properties to DT schema.

  Toshiba Visconti PCIe controller driver:

   - Fix interrupts array max constraints in DT schema"

* tag 'pci-v6.2-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (95 commits)
  x86/PCI: Use pr_info() when possible
  x86/PCI: Fix log message typo
  x86/PCI: Tidy E820 removal messages
  PCI: Skip allocate_resource() if too little space available
  efi/x86: Remove EfiMemoryMappedIO from E820 map
  PCI/portdrv: Allow AER service only for Root Ports & RCECs
  PCI: xilinx-nwl: Fix coding style violations
  PCI: mvebu: Switch to using gpiod API
  PCI: pciehp: Enable Command Completed Interrupt only if supported
  PCI: aardvark: Switch to using devm_gpiod_get_optional()
  dt-bindings: PCI: mediatek-gen3: add support for mt7986
  dt-bindings: PCI: mediatek-gen3: add SoC based clock config
  dt-bindings: PCI: qcom: Allow 'dma-coherent' property
  PCI: mt7621: Add sentinel to quirks table
  PCI: vmd: Fix secondary bus reset for Intel bridges
  PCI: endpoint: pci-epf-vntb: Fix sparse ntb->reg build warning
  PCI: endpoint: pci-epf-vntb: Fix sparse build warning for epf_db
  PCI: endpoint: pci-epf-vntb: Replace hardcoded 4 with sizeof(u32)
  PCI: endpoint: pci-epf-vntb: Remove unused epf_db_phy struct member
  PCI: endpoint: pci-epf-vntb: Fix call pci_epc_mem_free_addr() in error path
  ...
2022-12-14 09:54:10 -08:00
Ira Weiny 487d828d75 cxl/doe: Request exclusive DOE access
The PCIE Data Object Exchange (DOE) mailbox is a protocol run over
configuration cycles.  It assumes one initiator at a time.  While the
kernel has control of the mailbox user space writes could interfere with
the kernel access.

Mark DOE mailbox config space exclusive when iterated by the CXL driver.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20220926215711.2893286-3-ira.weiny@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-14 10:07:22 -08:00
Vidya Sagar e32e1e26c4 PCI: Add PCI_PTM_CAP_RES macro
Add macro defining Responder capable bit in Precision Time Measurement
capability register.

Link: https://lore.kernel.org/r/20220919143340.4527-2-vidyas@nvidia.com
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Jingoo Han <jingoohan1@gmail.com>
2022-10-27 14:40:59 +02:00
Jonathan Cameron 9d24322e88 PCI/DOE: Add DOE mailbox support functions
Introduced in a PCIe r6.0, sec 6.30, DOE provides a config space based
mailbox with standard protocol discovery.  Each mailbox is accessed
through a DOE Extended Capability.

Each DOE mailbox must support the DOE discovery protocol in addition to
any number of additional protocols.

Define core PCIe functionality to manage a single PCIe DOE mailbox at a
defined config space offset.  Functionality includes iterating,
creating, query of supported protocol, and task submission.  Destruction
of the mailboxes is device managed.

Cc: "Li, Ming" <ming4.li@intel.com>
Cc: Bjorn Helgaas <helgaas@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Acked-by: Bjorn Helgaas <helgaas@kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Co-developed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20220719205249.566684-4-ira.weiny@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-07-19 15:38:04 -07:00
Pali Rohár e8e7fbb6a3 PCI: Add PCI_EXP_SLTCTL_ASPL_DISABLE macro
Add macro defining Auto Slot Power Limit Disable bit in Slot Control
Register.

Link: https://lore.kernel.org/r/20220412094946.27069-2-pali@kernel.org
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2022-04-25 10:53:38 +01:00