# Data Source

## Create data source

POST /v2/team/datasets/{id}/datasources

Create a data source in a specified dataset. You can only create data sources in datasets that you have created.

The data source can be in one of the following formats: **.csv**, **.tsv**, **.md**, **.mdx**, **.json**, **.txt**, **.pdf**, **.pptx**, **.ppt**, **.doc**, **.docx**, **.xls**, or **.xlsx**.

> Body Request Parameters

```json
{
  "name": "test.csv",
  "type": "FILE",
  "user_id": "tmm-dsfasdfasdfa",
  "url": "https://s3.amazonaws.com/xxxtest/user/clvl4cad2001q01l1m522hxlu/upload/f9773f1e-cd68-489a-8121-d566ca9218b1.csv?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20240924T143419Z&X-Amz-SignedHeaders=host&X-Amz-Expires=599&X-Amz-Credential=AKIARLSQLXURHEIDN4OZ%2F20240924%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Signature=9ca0c58d508926a5811818041d557ffb53c64025dae94c0855280d457c7089a2"
}
```

### Request Parameters

|Name|Location|Type|Required|Description|
|---|---|---|---|---|
|id|path|string|Yes|Target dataset ID.|
|x-pd-external-trace-id|header|string|No|Trace ID set in your local system. Supports up to 128 characters. If an error occurs in the request, provide this ID to the MAXIR AI team for troubleshooting.|
|body|body|object|No|Request body.|
|» name|body|string|Yes|Data source name. Must include the file extension (for example, `example.csv`) and supports up to 128 characters. Longer names are truncated for display.|
|» type|body|string|Yes|The type of the data source. Set to **FILE**.|
|» url|body|string|No|The file URL for public access.|
|» file_object_key|body|string|No|The object storage path of the file uploaded locally.|
|» user_id|body|string|Yes|User ID, which is your unique identity in the organization.|

#### Detailed Explanation

**id**: Target dataset ID.

To query the list of datasets you have access to, call the [GET /v2/team/datasets](/docs/maxirai/API/api-reference/datasets#list-datasets) interface.

**» url**: File URL for public access.

Either `url` or `file_object_key` must be specified, but not both.

Only files with the following extensions are supported: **.csv**, **.tsv**, **.md**, **.mdx**, **.json**, **.txt**, **.pdf**, **.pptx**, **.ppt**, **.doc**, **.docx**, **.xls**, and **.xlsx**.

**» file_object_key**: The object storage path of the file uploaded locally.

Either `url` or `file_object_key` must be specified, but not both.

Only files with the following extensions are supported: **.csv**, **.tsv**, **.md**, **.mdx**, **.json**, **.txt**, **.pdf**, **.pptx**, **.ppt**, **.doc**, **.docx**, **.xls**, and **.xlsx**.

To obtain the `file_object_key`, upload the file through the [Upload file](/docs/maxirai/API/api-reference/file#upload-local-file) interface. The response to a completed upload returns the file's `file_object_key`.
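The constraints above (exactly one of `url` or `file_object_key`, and a name with a supported extension) can be checked on the client before the request is sent. The following is a minimal sketch, not an official client: `BASE_URL`, the function name, and the `auth_headers` parameter are hypothetical, the JSON `Content-Type` header is assumed from the example body, and authentication is outside the scope of this section. The request is built but not sent.

```python
import json
import os
import urllib.request

# Hypothetical placeholder host; replace with the real API base URL.
BASE_URL = "https://api.example.com"

SUPPORTED_EXTENSIONS = {
    ".csv", ".tsv", ".md", ".mdx", ".json", ".txt", ".pdf",
    ".pptx", ".ppt", ".doc", ".docx", ".xls", ".xlsx",
}


def build_create_datasource_request(dataset_id, name, user_id,
                                    url=None, file_object_key=None,
                                    auth_headers=None):
    """Build, without sending, a POST /v2/team/datasets/{id}/datasources request."""
    # Exactly one of url / file_object_key must be provided.
    if (url is None) == (file_object_key is None):
        raise ValueError("specify exactly one of url or file_object_key")

    # The name must include a supported extension. Lowercasing before the
    # comparison is an assumption; the API may be case-sensitive.
    extension = os.path.splitext(name)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported or missing file extension: {name!r}")

    body = {"name": name, "type": "FILE", "user_id": user_id}
    if url is not None:
        body["url"] = url
    else:
        body["file_object_key"] = file_object_key

    # Assumption: the body is sent as JSON; auth headers are supplied by the caller.
    headers = {"Content-Type": "application/json"}
    headers.update(auth_headers or {})
    return urllib.request.Request(
        f"{BASE_URL}/v2/team/datasets/{dataset_id}/datasources",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


request = build_create_datasource_request(
    "dataset-dagasdgasgasg", "test.csv", "tmm-dsfasdfasdfa",
    file_object_key="/tmp/sdgsagdsgsadgasdg",
)
print(request.get_method(), request.full_url)
```

Passing the result to `urllib.request.urlopen` would send it. Validating locally only saves a round trip; the server remains the authority on what it accepts.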

> Example Response

```json
{
  "code": 0,
  "data": {
    "id": "datasource-cadsgfsdagasgadsg",
    "dataset_id": "dataset-dagasdgasgasg",
    "name": "test.csv",
    "type": "FILE",
    "status": "synching"
  }
}
```

### Response

|Status Code|Status Code Meaning|Description|Data Model|
|---|---|---|---|
|200|[OK](https://tools.ietf.org/html/rfc7231#section-6.3.1)|none|Inline|

### Response Data Structure

Status Code **200**

|Name|Type|Required|Constraint|Description|
|---|---|---|---|---|
|» code|integer|true|none|Status code. **0** indicates a successful operation. Other values indicate an operation failure. For troubleshooting, refer to [Error Codes](/docs/maxirai/API/error-codes).|
|» data|object|true|none|Data source object.|
|»» id|string|true|none|Data source ID, which is the unique identifier of this data source in the dataset.|
|»» dataset_id|string|true|none|The ID of the dataset to which the data source belongs.|
|»» name|string|true|none|Data source name.|
|»» type|string|true|none|Data source type, fixed as **FILE**.|
|»» status|string|true|none|Processing status of the data source. Possible values are:<br /><br />- `invalid`: Pending processing.<br />- `synching`: Processing.<br />- `synched`: Successfully synchronized.|

#### Enumeration Values

|Property|Value|
|---|---|
|type|FILE|
|status|synching|
|status|synched|
|status|invalid|

### Response Header

|Status|Header|Type|Format|Description|
|---|---|---|---|---|
|200|x-pd-trace-id|string||Trace ID returned by MAXIR AI. In case of an error in the request, this ID can be provided to the MAXIR AI team for troubleshooting.|

## List data sources

GET /v2/team/datasets/{id}/datasources

Return the list of data sources in the specified dataset. When using this interface, please note:

- Ensure the specified dataset and your API Key belong to the same project.
- To view datasets you have access to within the project, call the [GET /v2/team/datasets](/docs/maxirai/API/api-reference/datasets#list-datasets) interface.

### Request Parameters

|Name|Location|Type|Required|Description|
|---|---|---|---|---|
|id|path|string|Yes|Target dataset ID.|
|page_number|query|integer|No|Start page number for paginated results. If not specified, the default value `1` is used.|
|page_size|query|integer|No|Number of records returned per page. If not specified, the default value `10` is used.|
|status|query|string|No|Data source status. If this parameter is specified, only data sources in the specified status are returned. Valid values: `invalid`, `synching`, `synched`.|
|user_id|query|string|Yes|User ID, which is your unique identity in the organization.|
|x-pd-external-trace-id|header|string|No|Trace ID set in your local system. Supports up to 128 characters. If an error occurs in the request, provide this ID to the MAXIR AI team for troubleshooting.|

#### Detailed Explanation

**id**: Target dataset ID.

To query the list of datasets you have access to, call the [GET /v2/team/datasets](/docs/maxirai/API/api-reference/datasets#list-datasets) interface.

**status**: Data source status. If this parameter is specified, only data sources in the specified status will be returned. Optional values include:

- `invalid`: Pending processing.
- `synching`: Processing.
- `synched`: Successfully synchronized.

If not specified, all data sources are returned.

To filter by several statuses, specify them as a comma-separated list. Data sources that match any of the listed statuses are returned.
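As an illustration of how the query parameters combine, the sketch below builds the request URL with a comma-separated `status` filter. `BASE_URL` and the function name are hypothetical placeholders, and the local status check is a convenience rather than documented API behavior.

```python
from urllib.parse import urlencode

# Hypothetical placeholder host; replace with the real API base URL.
BASE_URL = "https://api.example.com"

VALID_STATUSES = {"invalid", "synching", "synched"}


def build_list_datasources_url(dataset_id, user_id, page_number=1,
                               page_size=10, statuses=None):
    """Build the URL for GET /v2/team/datasets/{id}/datasources."""
    params = {
        "user_id": user_id,
        "page_number": page_number,
        "page_size": page_size,
    }
    if statuses:
        unknown = set(statuses) - VALID_STATUSES
        if unknown:
            raise ValueError(f"unknown status values: {sorted(unknown)}")
        # Several statuses are passed as one comma-separated value.
        params["status"] = ",".join(statuses)
    query = urlencode(params)  # percent-encodes the comma as %2C
    return f"{BASE_URL}/v2/team/datasets/{dataset_id}/datasources?{query}"


list_url = build_list_datasources_url(
    "dataset-dagasdgasgasg", "tmm-dsfasdfasdfa",
    statuses=["synching", "synched"],
)
print(list_url)
```

Omitting `statuses` leaves the `status` parameter out of the query, which returns data sources in every status.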

#### Enumeration Values

|Property|Value|
|---|---|
|status|synching|
|status|invalid|
|status|synched|

> Example Response

```json
{
  "code": 0,
  "data": {
    "total_items": 1,
    "page_size": 10,
    "page_number": 1,
    "records": [
      {
        "id": "datasource-cadsgfsdagasgadsg",
        "dataset_id": "dataset-dagasdgasgasg",
        "name": "test.csv",
        "type": "FILE",
        "status": "synching"
      }
    ]
  }
}
```
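A paginated response like the one above can be reduced to its records plus a flag that says whether another page should be requested. This sketch assumes that `total_items` counts all matching data sources across pages, not only those on the current page; the function name is hypothetical.

```python
import json
import math


def summarize_page(response_text):
    """Parse a list response and work out whether more pages remain."""
    payload = json.loads(response_text)
    if payload["code"] != 0:
        raise RuntimeError(f"request failed with code {payload['code']}")
    data = payload["data"]
    # Assumption: total_items is the total across all pages.
    total_pages = math.ceil(data["total_items"] / data["page_size"])
    return {
        "records": data["records"],
        "total_pages": total_pages,
        "has_next_page": data["page_number"] < total_pages,
    }


example_response = """
{
  "code": 0,
  "data": {
    "total_items": 1,
    "page_size": 10,
    "page_number": 1,
    "records": [
      {
        "id": "datasource-cadsgfsdagasgadsg",
        "dataset_id": "dataset-dagasdgasgasg",
        "name": "test.csv",
        "type": "FILE",
        "status": "synching"
      }
    ]
  }
}
"""

page_summary = summarize_page(example_response)
print(page_summary["total_pages"], page_summary["has_next_page"])
```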

### Response

|Status Code|Status Code Meaning|Description|Data Model|
|---|---|---|---|
|200|[OK](https://tools.ietf.org/html/rfc7231#section-6.3.1)|none|Inline|

### Response Data Structure

Status Code **200**

|Name|Type|Required|Constraint|Description|
|---|---|---|---|---|
|» code|integer|true|none|Status code. **0** indicates a successful operation. Other values indicate an operation failure. For troubleshooting, refer to [Error Codes](/docs/maxirai/API/error-codes).|
|» data|object|true|none|Paginated list of data sources.|
|»» total_items|integer|true|none|Total number of data sources returned.|
|»» page_size|integer|true|none|Number of data sources returned per page.|
|»» page_number|integer|true|none|Page number of the current page.|
|»» records|[object]|true|none|List of data sources returned on the current page.|
|»»» id|string|true|none|Data source ID, which is the unique identifier of this data source in the dataset.|
|»»» dataset_id|string|true|none|The ID of the dataset to which the data source belongs.|
|»»» name|string|true|none|Data source name.|
|»»» type|string|true|none|Data source type, fixed as **FILE**.|
|»»» status|string|true|none|Processing status of the data source. Possible values are:<br /><br />- `invalid`: Pending processing.<br />- `synching`: Processing.<br />- `synched`: Successfully synchronized.|

#### Enumeration Values

|Property|Value|
|---|---|
|type|FILE|
|status|synching|
|status|synched|
|status|invalid|

### Response Header

|Status|Header|Type|Format|Description|
|---|---|---|---|---|
|200|x-pd-trace-id|string||Trace ID returned by MAXIR AI. In case of an error in the request, this ID can be provided to the MAXIR AI team for troubleshooting.|

## Delete data source

DELETE /v2/team/datasets/{dataset_id}/datasources/{datasource_id}

Delete a data source from the specified dataset. Once deleted, the data source cannot be recovered.

You can only delete data sources in your own datasets.

> Body Request Parameters

```json
{
  "user_id": "tmm-dafasdfasdfasdf"
}
```

### Request Parameters

|Name|Location|Type|Required|Description|
|---|---|---|---|---|
|dataset_id|path|string|Yes|Target dataset ID.|
|datasource_id|path|string|Yes|ID of the data source to delete.|
|x-pd-external-trace-id|header|string|No|Trace ID set in your local system. Supports up to 128 characters. If an error occurs in the request, provide this ID to the MAXIR AI team for troubleshooting.|
|body|body|object|No|Request body.|
|» user_id|body|string|Yes|User ID, which is your unique identity in the organization.|

#### Detailed Explanation

**dataset_id**: Target dataset ID.

To query the list of datasets you have access to, call the [GET /v2/team/datasets](/docs/maxirai/API/api-reference/datasets#list-datasets) interface.

**datasource_id**: ID of the data source to delete.

To query the data sources in a specified dataset, call the [GET /v2/team/datasets/{id}/datasources](/docs/maxirai/API/api-reference/data-source#list-data-sources) interface.
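Unlike the GET interfaces, which take `user_id` as a query parameter, this interface takes it in a request body. The sketch below builds such a DELETE request without sending it. `BASE_URL`, the function name, and `auth_headers` are hypothetical, and the JSON `Content-Type` header is assumed from the example body.

```python
import json
import urllib.request

# Hypothetical placeholder host; replace with the real API base URL.
BASE_URL = "https://api.example.com"


def build_delete_datasource_request(dataset_id, datasource_id, user_id,
                                    auth_headers=None):
    """Build, without sending, a DELETE request for one data source."""
    # Assumption: the body is sent as JSON; auth headers are supplied by the caller.
    headers = {"Content-Type": "application/json"}
    headers.update(auth_headers or {})
    return urllib.request.Request(
        f"{BASE_URL}/v2/team/datasets/{dataset_id}/datasources/{datasource_id}",
        data=json.dumps({"user_id": user_id}).encode("utf-8"),
        headers=headers,
        method="DELETE",
    )


delete_request = build_delete_datasource_request(
    "dataset-dagasdgasgasg", "datasource-cadsgfsdagasgadsg", "tmm-dafasdfasdfasdf",
)
print(delete_request.get_method(), delete_request.full_url)
```

Because deletion cannot be undone, a caller may want to confirm the target with the Get data source interface before sending this request.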

> Example Response

```json
{
  "code": 0,
  "data": {}
}
```

### Response

|Status Code|Status Code Meaning|Description|Data Model|
|---|---|---|---|
|200|[OK](https://tools.ietf.org/html/rfc7231#section-6.3.1)|none|Inline|

### Response Data Structure

Status Code **200**

|Name|Type|Required|Constraint|Description|
|---|---|---|---|---|
|» code|integer|true|none|Status code. **0** indicates a successful operation. Other values indicate an operation failure. For troubleshooting, refer to [Error Codes](/docs/maxirai/API/error-codes).|
|» data|object¦null|false|none|Empty if the operation is successful: `null` or an empty object (`{}`).|

### Response Header

|Status|Header|Type|Format|Description|
|---|---|---|---|---|
|200|x-pd-trace-id|string||Trace ID returned by MAXIR AI. In case of an error in the request, this ID can be provided to the MAXIR AI team for troubleshooting.|

## Get data source

GET /v2/team/datasets/{dataset_id}/datasources/{datasource_id}

Retrieve information about a specified data source.

### Request Parameters

|Name|Location|Type|Required|Description|
|---|---|---|---|---|
|dataset_id|path|string|Yes|The dataset ID where the target data source is located.|
|datasource_id|path|string|Yes|The target data source ID.|
|user_id|query|string|Yes|User ID, which is your unique identity in the organization.|
|x-pd-external-trace-id|header|string|No|Trace ID set in your local system. Supports up to 128 characters. If an error occurs in the request, provide this ID to the MAXIR AI team for troubleshooting.|

#### Detailed Explanation

**dataset_id**: The dataset ID where the target data source is located.

To query the list of datasets you have access to, call the [GET /v2/team/datasets](/docs/maxirai/API/api-reference/datasets#list-datasets) interface.

**datasource_id**: The target data source ID.

To query the data sources in a specified dataset, call the [GET /v2/team/datasets/{id}/datasources](/docs/maxirai/API/api-reference/data-source#list-data-sources) interface.

> Example Response

```json
{
  "code": 0,
  "data": {
    "id": "datasource-cadsgfsdagasgadsg",
    "dataset_id": "dataset-dagasdgasgasg",
    "name": "test.csv",
    "type": "FILE",
    "status": "synching"
  }
}
```

### Response

|Status Code|Status Code Meaning|Description|Data Model|
|---|---|---|---|
|200|[OK](https://tools.ietf.org/html/rfc7231#section-6.3.1)|none|Inline|

### Response Data Structure

Status Code **200**

|Name|Type|Required|Constraint|Description|
|---|---|---|---|---|
|» code|integer|true|none|Status code. **0** indicates a successful operation. Other values indicate an operation failure. For troubleshooting, refer to [Error Codes](/docs/maxirai/API/error-codes).|
|» data|object|true|none|Data source object.|
|»» id|string|true|none|Data source ID, which is the unique identifier of this data source in the dataset.|
|»» dataset_id|string|true|none|The ID of the dataset to which the data source belongs.|
|»» name|string|true|none|Data source name.|
|»» type|string|true|none|Data source type, fixed as **FILE**.|
|»» status|string|true|none|Processing status of the data source. Possible values are:<br /><br />- `invalid`: Pending processing.<br />- `synching`: Processing.<br />- `synched`: Successfully synchronized.|
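Since a newly created data source can report `invalid` (pending) or `synching` before it reaches `synched`, a caller may want to poll this interface until processing finishes. The sketch below shows one way to do that. `fetch_payload` is a hypothetical callable that performs the GET request and returns the decoded JSON response, and the attempt limit and interval are arbitrary choices, not documented values. The example run uses simulated responses in place of network calls.

```python
import time


def extract_status(payload):
    """Return data.status from a Get data source response."""
    if payload.get("code") != 0:
        raise RuntimeError(f"request failed with code {payload.get('code')}")
    return payload["data"]["status"]


def wait_until_synched(fetch_payload, max_attempts=30, interval_s=2.0,
                       sleep=time.sleep):
    """Poll until the data source is synched; return False if attempts run out."""
    for attempt in range(max_attempts):
        if extract_status(fetch_payload()) == "synched":
            return True
        # "invalid" (pending) and "synching" both mean: keep waiting.
        if attempt + 1 < max_attempts:
            sleep(interval_s)
    return False


# Simulated responses standing in for three successive GET calls.
simulated = iter(["invalid", "synching", "synched"])


def fake_fetch():
    return {"code": 0, "data": {"status": next(simulated)}}


is_ready = wait_until_synched(fake_fetch, sleep=lambda seconds: None)
print(is_ready)
```

Injecting `sleep` keeps the function testable without real delays; production code would leave the default `time.sleep` in place.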

#### Enumeration Values

|Property|Value|
|---|---|
|type|FILE|
|status|synching|
|status|synched|
|status|invalid|

### Response Header

|Status|Header|Type|Format|Description|
|---|---|---|---|---|
|200|x-pd-trace-id|string||Trace ID returned by MAXIR AI. In case of an error in the request, this ID can be provided to the MAXIR AI team for troubleshooting.|

## Create data source without specifying a dataset

POST /v2/team/datasources

Create a data source directly, without specifying a dataset.

When you call this interface, MAXIR AI automatically creates a dataset for the data source. Save the dataset ID from the response for later operations, such as [associating the dataset with a job](/docs/maxirai/API/api-reference/job#create-job) for data analysis and exploration.

> Body Request Parameters

```json
{
  "name": "test.csv",
  "type": "FILE",
  "user_id": "tmm-dafasdfasdfasdf",
  "file_object_key": "/tmp/sdgsagdsgsadgasdg"
}
```

### Request Parameters

|Name|Location|Type|Required|Description|
|---|---|---|---|---|
|x-pd-external-trace-id|header|string|No|Trace ID set in your local system. Supports up to 128 characters. If an error occurs in the request, provide this ID to the MAXIR AI team for troubleshooting.|
|body|body|object|No|Request body.|

> Example Response

```json
{
  "code": 0,
  "data": {
    "id": "datasource-cadsgfsdagasgadsg",
    "dataset_id": "dataset-dagasdgasgasg",
    "name": "test.csv",
    "type": "FILE",
    "status": "synching"
  }
}
```
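Because the dataset is created implicitly, the `dataset_id` in the response is the caller's only handle on it. A minimal sketch of extracting both IDs from a response shaped like the example above follows; the function name is hypothetical.

```python
import json


def extract_ids(response_text):
    """Return (dataset_id, datasource_id) from a create response."""
    payload = json.loads(response_text)
    if payload["code"] != 0:
        raise RuntimeError(f"request failed with code {payload['code']}")
    data = payload["data"]
    return data["dataset_id"], data["id"]


example_response = """
{
  "code": 0,
  "data": {
    "id": "datasource-cadsgfsdagasgadsg",
    "dataset_id": "dataset-dagasdgasgasg",
    "name": "test.csv",
    "type": "FILE",
    "status": "synching"
  }
}
"""

dataset_id, datasource_id = extract_ids(example_response)
print(dataset_id, datasource_id)
```

Storing the pair together makes later calls to the Get, Delete, and Presign interfaces straightforward, since each takes both IDs in its path.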

### Response

|Status Code|Status Code Meaning|Description|Data Model|
|---|---|---|---|
|200|[OK](https://tools.ietf.org/html/rfc7231#section-6.3.1)|none|Inline|

### Response Data Structure

Status Code **200**

|Name|Type|Required|Constraint|Description|
|---|---|---|---|---|
|» code|integer|true|none|Status code. **0** indicates a successful operation. Other values indicate an operation failure. For troubleshooting, refer to [Error Codes](/docs/maxirai/API/error-codes).|
|» data|object|true|none|Data source object.|
|»» id|string|true|none|Data source ID, which is the unique identifier of this data source in the dataset.|
|»» dataset_id|string|true|none|The ID of the dataset to which the data source belongs.|
|»» name|string|true|none|Data source name.|
|»» type|string|true|none|Data source type, fixed as **FILE**.|
|»» status|string|true|none|Processing status of the data source. Possible values are:<br /><br />- `invalid`: Pending processing.<br />- `synching`: Processing.<br />- `synched`: Successfully synchronized.|

#### Enumeration Values

|Property|Value|
|---|---|
|type|FILE|
|status|synching|
|status|synched|
|status|invalid|

### Response Header

|Status|Header|Type|Format|Description|
|---|---|---|---|---|
|200|x-pd-trace-id|string||Trace ID returned by MAXIR AI. In case of an error in the request, this ID can be provided to the MAXIR AI team for troubleshooting.|

## Presign data source

POST /v2/team/datasets/{dataset_id}/datasources/{datasource_id}/presign

Generate a pre-signed URL for the specified data source. You can download the data source through this URL.

The pre-signed URL expires, so complete the download before the expiration time.

> Body Request Parameters

```json
{
  "expires_in": 600,
  "user_id": "tmm-dafasdfasdfasdf"
}
```

### Request Parameters

|Name|Location|Type|Required|Description|
|---|---|---|---|---|
|dataset_id|path|string|Yes|The dataset ID where the target data source is located.|
|datasource_id|path|string|Yes|The target data source ID.|
|x-pd-external-trace-id|header|string|No|Trace ID set in your local system. Supports up to 128 characters. If an error occurs in the request, provide this ID to the MAXIR AI team for troubleshooting.|
|body|body|object|No|Request body.|
|» expires_in|body|integer|No|Expiration time of the pre-signed URL, in seconds (s). The minimum value is `60`, and the default value is `600`.|
|» user_id|body|string|Yes|User ID, which is your unique identity in the organization.|

#### Detailed Explanation

**dataset_id**: The dataset ID where the target data source is located.

To query the list of datasets you have access to, call the [GET /v2/team/datasets](/docs/maxirai/API/api-reference/datasets#list-datasets) interface.

**datasource_id**: The target data source ID.

To query the data sources in a specified dataset, call the [GET /v2/team/datasets/{id}/datasources](/docs/maxirai/API/api-reference/data-source#list-data-sources) interface.
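The `expires_in` minimum of `60` seconds can be enforced before the request is sent. The sketch below builds the presign request without sending it. `BASE_URL`, the function name, and `auth_headers` are hypothetical, and the JSON `Content-Type` header is assumed from the example body.

```python
import json
import urllib.request

# Hypothetical placeholder host; replace with the real API base URL.
BASE_URL = "https://api.example.com"

MIN_EXPIRES_IN = 60       # documented minimum, in seconds
DEFAULT_EXPIRES_IN = 600  # documented default, in seconds


def build_presign_request(dataset_id, datasource_id, user_id,
                          expires_in=DEFAULT_EXPIRES_IN, auth_headers=None):
    """Build, without sending, a POST .../presign request."""
    if expires_in < MIN_EXPIRES_IN:
        raise ValueError(f"expires_in must be at least {MIN_EXPIRES_IN} seconds")
    body = {"expires_in": expires_in, "user_id": user_id}
    # Assumption: the body is sent as JSON; auth headers are supplied by the caller.
    headers = {"Content-Type": "application/json"}
    headers.update(auth_headers or {})
    return urllib.request.Request(
        f"{BASE_URL}/v2/team/datasets/{dataset_id}"
        f"/datasources/{datasource_id}/presign",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


presign_request = build_presign_request(
    "dataset-dagasdgasgasg", "datasource-cadsgfsdagasgadsg", "tmm-dafasdfasdfasdf",
)
print(presign_request.get_method(), presign_request.full_url)
```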

> Example Response

```json
{
  "code": 0,
  "data": {
    "presigned_url": "string",
    "expires_at": "2024-11-13T14:15:22.123Z"
  }
}
```
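To decide whether a previously issued URL is still usable, a client can compare `expires_at` with the current time. This sketch assumes `expires_at` is an ISO 8601 UTC timestamp with a trailing `Z`, as in the example above; the function name is hypothetical.

```python
from datetime import datetime, timezone


def seconds_until_expiry(expires_at, now=None):
    """Return the seconds left before the pre-signed URL expires (negative if past)."""
    # Python versions before 3.11 do not accept a trailing "Z" in
    # fromisoformat, so it is rewritten as an explicit UTC offset.
    expiry = datetime.fromisoformat(expires_at.replace("Z", "+00:00"))
    if now is None:
        now = datetime.now(timezone.utc)
    return (expiry - now).total_seconds()


# A fixed "now" ten minutes before the example expiry keeps the run reproducible.
fixed_now = datetime(2024, 11, 13, 14, 5, 22, 123000, tzinfo=timezone.utc)
remaining = seconds_until_expiry("2024-11-13T14:15:22.123Z", now=fixed_now)
print(remaining)  # 600.0
```

If the remaining time is shorter than the expected download duration, requesting a fresh pre-signed URL avoids a download that fails partway through.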

### Response

|Status Code|Status Code Meaning|Description|Data Model|
|---|---|---|---|
|200|[OK](https://tools.ietf.org/html/rfc7231#section-6.3.1)|none|Inline|

### Response Data Structure

Status Code **200**

|Name|Type|Required|Constraint|Description|
|---|---|---|---|---|
|» code|integer|true|none|Status code. **0** indicates a successful operation. Other values indicate an operation failure. For troubleshooting, refer to [Error Codes](/docs/maxirai/API/error-codes).|
|» data|object|true|none|Returned data object.|
|»» presigned_url|string|true|none|Pre-signed URL for downloading the corresponding data source.|
|»» expires_at|string(date-time)|true|none|Expiration date and time of the pre-signed URL.|

### Response Header

|Status|Header|Type|Format|Description|
|---|---|---|---|---|
|200|x-pd-trace-id|string||Trace ID returned by MAXIR AI. In case of an error in the request, this ID can be provided to the MAXIR AI team for troubleshooting.|
